New TMC docs: don't use hold_current; avoid interpolate #4977

Merged
merged 7 commits into from
Dec 12, 2021

Conversation

KevinOConnor
Collaborator

This updates the TMC_Drivers.md document to make two notable recommendations:

  1. Don't use hold_current.
  2. For best positional accuracy, prefer using spreadCycle mode and set interpolate: False.

Details are in the proposed document: https://github.com/Klipper3d/klipper/blob/work-tmctuning-20211128/docs/TMC_Drivers.md

These documentation changes are the results of tests I've been running with the magnetic angle sensors and the accelerometer.

Both of the settings above introduce a positional error that is very small (a handful of microns), but there doesn't seem to be a compelling reason to pay a penalty, as there are alternative configurations available.

As part of this PR, I considered changing the interpolate default from True to False. However, I fear doing so may increase the audible noise of printers to the point that some users switch to stealthChop mode. That would be a net loss, as the positional error introduced by stealthChop mode is significantly larger than the penalty from interpolation (e.g., 25x). So, the default remains interpolate: True after this PR.

Comments?
-Kevin

shown an increase in positional deviation of nearly a full-step when
using stealthChop mode (for example, on a printer with 40mm
rotation_distance and 200 steps_per_rotation, position deviation of
constant speed moves increased by ~0.160mm).
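
As a rough check of the "nearly a full-step" figure in the excerpt above, here is a minimal Python sketch of the arithmetic. The function and variable names are made up for illustration; only the 40mm rotation_distance, 200 steps per rotation, and ~0.160mm values come from the text.

```python
def full_step_distance_mm(rotation_distance_mm, full_steps_per_rotation):
    """Linear travel produced by one full motor step."""
    return rotation_distance_mm / full_steps_per_rotation


# Values taken from the documentation excerpt above.
full_step_mm = full_step_distance_mm(40.0, 200)  # 0.2 mm per full step
deviation_mm = 0.160

# Express the observed deviation as a fraction of one full step.
fraction_of_full_step = deviation_mm / full_step_mm
print(f"full step = {full_step_mm:.3f} mm, "
      f"deviation = {fraction_of_full_step:.0%} of a full step")
```

With these numbers the deviation works out to roughly 80% of one full step.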
Collaborator


Very interesting! This seems to be really a lot. And on paper, it looks like the motor is close to skipping a step. So, the question is, is it a random deviation, or a position lag? And if it is a lag, is it velocity- and current- and microstepping-dependent? Any available data would be very helpful. Basically, if the lag comes from internal processing in TMC driver, and not from the loss of torque, it might be a useful recommendation to set the same parameters for the drivers of all moving axes (e.g. same microstepping, and enabling stealthChop for all of them) so that the lag is consistent and reduces de-synchronization between the linear motion of different axes and the extruder, for instance. And if it is due to torque, it indicates, in general, a potentially unsafe operation (but if this lag exists during the constant velocity moves, the resistance to motion should be fairly small, and thus such motion should not require much torque?).

Then a separate question is Z axis, could this introduce a backlash effect and introduce systematic issues with the first layers (and further layers if Z hop or bed meshing is enabled).

@celtare21

Is spreadCycle mode supposed to make quite an ear-scratching screech, especially when not moving or on very little movement?
I've been running my z stepper with "stealthchop_threshold: 0.05".
Otherwise, the noise is totally unbearable, no matter if microsteps were at 16, 64 or 128.

@dalegaard
Contributor

If your stepper is making a loud noise, your switching frequency is too low. Trinamic has calculation sheets for their drivers where you plug in motor parameters and they give you back some good starting points for chopper configuration. If the chopper noise is audible, decreasing TOFF (and potentially TBL as well) is especially relevant.

@celtare21

TOFF

Is there a more in-depth guide on how to do the tuning?

@KevinOConnor
Collaborator Author

KevinOConnor commented Nov 30, 2021

Hi @dmbutyugin

This seems to be really a lot. And on paper, it looks like the motor is close to skipping a step. So, the question is, is it a random deviation, or a position lag?

In this case, I was referring to the "stepper lag" we discussed in #4560 . It's a reproducible position lag during constant velocity moves. I reran some tests for comparison purposes.

Current 1amp, 128 microsteps, interpolation off, spreadCycle:
[graph: moves_10]
Current 1amp, 128 microsteps, interpolation off, stealthChop:
[graph: moves_10_st]
When in stealthChop mode, the motors show a consistent deviation that is significantly larger than when in spreadCycle mode.

And if it is a lag, is it velocity- and current- and microstepping-dependent?

Seems not. Changing velocity only has a minor impact, changing current only has a minor impact, changing interpolation doesn't seem to have an impact (at least when at 128 microsteps), and the microsteps are already quite high (128).

Basically, if the lag comes from internal processing in TMC driver, and not from the loss of torque

It does seem that way to me.

it might be a useful recommendation to set the same parameters for the drivers of all moving axes (e.g. same microstepping, and enabling stealthChop for all of them) so that the lag is consistent and reduces de-synchronization between the linear motion of different axes and the extruder, for instance.

FWIW, I don't see how this would help. The lag seems to be a distance and not a time. I'd expect the X, Y, and E steppers to typically move with different velocities - so, a constant distance deviation will result in a notably different effective time lag. Thus I don't see a benefit in trying to synchronize the distance lag.

And if it is due to torque, it indicates, in general, a potentially unsafe operation

I'm not seeing an issue like that. I can still move the steppers at 200mm/s and still see similar amounts of lag with 0.5amps, 1.0amps, and 1.5 amps. I didn't see anything that indicates steps would be lost. (To "lose a step", the stepper would need to lag two full steps.)

Then a separate question is Z axis, could this introduce a backlash effect and introduce systematic issues with the first layers (and further layers if Z hop or bed meshing is enabled).

This is a good point. Indeed, I fear that translations like bed_mesh may not interact well with stealthChop. Alas, I don't have an angle sensor on this printer's Z axis, and it's not immediately easy for me to add one. It might be possible to put together a test on the XY axes (which do have angle sensors), but I suspect a good test should include the impact of gravity.

Any available data would be very helpful.

I reran some tests today.

Current 1amp, 128 microsteps, interpolation off, spreadCycle:
moves_10.index.gz
moves_10.json.gz
Current 1amp, 128 microsteps, interpolation off, stealthChop:
moves_10_st.json.gz
moves_10_st.index.gz
Current 0.5amp, 128 microsteps, interpolation off, stealthChop:
moves_05_st.index.gz
moves_05_st.json.gz
Current 1.5amp, 128 microsteps, interpolation off, stealthChop:
moves_15_st.json.gz
moves_15_st.index.gz

The graphs above were generated using:
./scripts/motan/motan_graph.py moves_10 -s 26 -d 7 -g '[["trapq(toolhead,x_velocity)"], ["deviation(angle(angle_x),stepq(stepper_x))"]]'

The runs are all using this gcode file:
moves.gcode.gz

The moves.gcode is unchanged from the discussion in PR #4560, so the data files there with various spreadCycle settings should still apply.

I can run additional tests if you have something in mind. This printer (a Voron Zero) has an adxl345 on the toolhead and angle sensors on X, Y, E.

Thanks,
-Kevin

@pedrolamas
Contributor

I'm all in for accuracy, but regarding the removal of hold_current, how bad will the impact be in terms of heat from the motors?

My point is that I can see some users complaining about Klipper stopping due to the motors overheating and thus prefer to use hold_current as a "better this than not working" solution... and that probably should be very clear on the docs.

@jakep82
Collaborator

jakep82 commented Nov 30, 2021

I'm all in for accuracy, but regarding the removal of hold_current, how bad will the impact be in terms of heat from the motors?

My point is that I can see some users complaining about Klipper stopping due to the motors overheating and thus prefer to use hold_current as a "better this than not working" solution... and that probably should be very clear on the docs.

If your motors are overheating, then your run_current is too high.

@dmbutyugin
Collaborator

@KevinOConnor, yes, I remember that conversation, now I understand that you were referring to it.

When in stealthChop mode, the motors show a consistent deviation that is significantly larger than when in spreadCycle mode.

Yes, I understand. I believe that this lag definitely exists. It is just that the magnitude of this lag is somewhat surprising and counter-intuitive, is all. Unfortunately, I don't have angle sensors and therefore I cannot test what it looks like on my printer. ADXL345 isn't necessarily helpful here.

I am also confused about the amount of lag. I've been using StealthChop2 mode on my Ender3 pretty much from the very start, and I never had issues with clearances, including walls not parallel to X and Y axes (so the typical achievable tolerances were 0.1 mm or 0.15 mm). If that lag were to develop, I'd probably be running into some clearance problems with connecting parts once in a while.

And if it is a lag, is it velocity- and current- and microstepping-dependent?

Seems not. Changing velocity only has a minor impact, changing current only has a minor impact, changing interpolation doesn't seem to have an impact (at least when at 128 microsteps), and the microsteps are already quite high (128).

This part seems counter-intuitive to me. Basically, the fact that there is no proportionality between the lag and the current or velocity. However, if the lag was due to some physical processes in the system, I'd expect it to decrease with current (if it was due to lower torque or stepper dead zones), or increase with velocity (the friction should increase linearly or super-linearly, e.g. quadratically, with velocity). This does not happen however.

Basically, if the lag comes from internal processing in TMC driver, and not from the loss of torque

It does seem that way to me.

Perhaps. But it would look like it is some almost fixed part of the rotation between the phases. It is unclear to me what kind of buffering or delayed processing would result in this, given that the amount of lag, and therefore delayed microsteps, is quite large.

it might be a useful recommendation to set the same parameters for the drivers of all moving axes (e.g. same microstepping, and enabling stealthChop for all of them) so that the lag is consistent and reduces de-synchronization between the linear motion of different axes and the extruder, for instance.

FWIW, I don't see how this would help. The lag seems to be a distance and not a time. I'd expect the X, Y, and E steppers to typically move with different velocities - so, a constant distance deviation will result in a notably different effective time lag. Thus I don't see a benefit in trying to synchronize the distance lag.

Oh, yes, true.

And if it is due to torque, it indicates, in general, a potentially unsafe operation

I'm not seeing an issue like that. I can still move the steppers at 200mm/s and still see similar amounts of lag with 0.5amps, 1.0amps, and 1.5 amps. I didn't see anything that indicates steps would be lost. (To "lose a step", the stepper would need to lag two full steps.)

The stepper needs to 'lag' two full steps. But if I am not mistaken, the torque rises when the lag increases from 0 to 1 full step, and then drops when the lag continues to increase from 1 to 2 full steps, right? Then if the external force exceeds the motor torque at 1 full step lag, it would immediately skip by 2 full steps, because the torque only decays afterwards (well, strictly speaking, if the external force decays faster than the torque, then the stepper won't skip and can operate with a lag more than 1 full step; but I'm not sure how frequently that can happen in practice).
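
To illustrate the torque argument above, here is a small Python sketch using the common idealized model in which restoring torque varies sinusoidally with rotor lag, peaking at one full step and falling back to zero at two full steps. The model and the function name are assumptions for illustration only; real motors and drivers deviate from it.

```python
import math


def relative_torque(lag_full_steps):
    """Idealized restoring torque as a fraction of peak torque.

    Assumes a sinusoidal torque curve: zero at no lag, maximum at a lag
    of one full step, and zero again at a lag of two full steps.
    """
    return math.sin(math.pi / 2.0 * lag_full_steps)


for lag in (0.0, 0.5, 0.8, 1.0, 1.5, 2.0):
    print(f"lag {lag:.1f} full steps -> "
          f"{relative_torque(lag):.3f} of peak torque")
```

In this model, any load that exceeds the peak at one full step of lag meets only decreasing torque beyond that point, which is the skipping scenario described above.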

Then, the fact that the lag does not decrease with an increase of current likely indicates that this is not torque-related. At least, if the stepper driver does not artificially limit the current during operation. But you can check the latter by touching the stepper - at high currents, it is expected to get hotter.

BTW, I was reading TMC2209 and TMC5160 datasheets about StealthChop2 operation, and nothing there indicates some current adjustments like that. Unless CoolStep is enabled, but Klipper does not use that. In fact, StealthChop2 seems to be a voltage PWM mode, which uses a simple proportional regulator to adjust the current (with pwm_autoscale = 1 and pwm_autograd = 1). The datasheets were not clear about any averaging for the regulator, but there is probably something in place, because the PWM frequency is 58.5kHz by default in Klipper? In any case, I'd be surprised to learn that the stepper driver introduces some significant lag here of more than 75% of the full step due to PWM regulation.

There is only one thing I can think of that is suboptimal in Klipper w.r.t. StealthChop2 autotuning: "Just two steps have to be respected by the motion controller for best results: Start with the motor in standstill, but powered with nominal run current (AT#1). Move the motor at a medium velocity, e.g. as part of a homing procedure (AT#2)." (from the datasheet). But I think Klipper does not respect the initial standstill during homing (> 130 ms). Or does it? But I doubt this has any effect on the lag in question.

I can run additional tests if you have something in mind. This printer (a Voron Zero) has an adxl345 on the toolhead and angle sensors on X, Y, E.

I can think of 2 tests:

  1. All tests so far have motion in alternating direction and changing velocity. But it would be great to run two or more acceleration phases in the same direction to see how lag behaves during acceleration (when there is significant load on the stepper), perhaps at two different currents (e.g. 0.5 Amp and 1.0 Amp). During deceleration, it simply crosses 0 and changes the sign, and I'm afraid the data there is not very telling.
  2. Test a StealthChop lag in Velocity Based Scaling mode (pwm_autoscale = 0). In this case, any regulation is disabled, so it may affect the lag. It is only necessary to calculate PWM_OFS and PWM_GRAD registers from the known motor parameters as per datasheet. Of course, the actual motor parameters may be known inaccurately, and this follows a linear approximation of sorts, but since it's for testing only, I think that should be OK. And it would be very interesting to see how the lag develops here.

But in any case, it is worthwhile to warn the users about this potential large lag, even if we don't understand its origin. So, this discussion shouldn't block the PR itself.

@KevinOConnor
Collaborator Author

how bad will the impact be in terms of heat from the motors?

Thanks for the feedback.

My high-level understanding is that hold_current does not provide a notable cooling benefit during prints. So, it seems there isn't a reason to "pay a penalty" for hold_current (even if the penalty is small).

It's hard to make general statements about 3d printers, because there are tons of different printers out there, and behavior can vary from print to print. However, with the current default settings of TPOWERDOWN and IHOLDDELAY on a tmc2209, a power reduction would only start after 633ms of idle time - and likely only reach the requested hold current after about 800ms or so. For XY motors, I suspect very few moves would involve one of the steppers inactive for that amount of time. Given that most people set the hold_current to around 75% of the run_current anyway, I'd guess using a hold_current provides marginal cooling. The timings/defaults are similar for other TMC drivers. I expect the extruder to have a similar pattern. If bed_mesh or bed_tilt is used, then the Z stepper won't idle and won't see a benefit to hold_current.

I have no plans to remove the hold_current capability. So, if people want to use it, they certainly can. From a documentation point of view, though, I think we can provide a clear and simple message - if you haven't investigated hold_current and know what its limitations are, then don't use it. I suspect a lot of people enable hold_current today, because "oh that sounds useful", and my recent investigation has led me to believe that sentiment isn't really accurate.

Cheers,
-Kevin

@Gilabite

Gilabite commented Dec 1, 2021

I've been running my corexy (TMC5160s) like this on all motors since this was posted and my motor temps are the same as before.

@KevinOConnor
Collaborator Author

I've been using StealthChop2 mode on my Ender3 pretty much from the very start, and I never had issues with clearances, including walls not parallel to X and Y axes (so the typical achievable tolerances were 0.1 mm or 0.15 mm). If that lag were to develop, I'd probably be running into some clearance problems with connecting parts once in a while.

FWIW, I'm not sure the lag would show up as a dimensional error. I'd guess the toolhead is mostly following the commanded path, just not with great extruder synchronization and maybe with some minor dimensional issues for diagonal moves. It might show up as minor blobbing issues; and it's possible people are tuning some of that out with PA and flow_rate calibration. Admittedly, I'm "hand waving" a lot here. The data from the angle sensor is pretty convincing though.

It is unclear to me what kind of buffering or delayed processing would result in this, given that the amount of lag, and therefore delayed microsteps, is quite large.
...
I was reading TMC2209 and TMC5160 datasheets about StealthChop2 operation, and nothing there indicates some current adjustments like that.

FWIW, I've spent a lot of time parsing the TMC specs, and I feel they are equal parts "hand waving", "technical information", and "marketing product brochure". Often when I read them I end up with more questions than when I started. So, take what I say "with a grain of salt".

I didn't get the impression that there is any kind of "command buffering"- I'd guess it's more about a systemic lag developing from its prediction system. If I understand stealthChop correctly, it measures how quickly it can push current through the coils, just as spreadCycle does. However, it doesn't use those measurements to directly control current, but instead uses them to make future predictions on how long to enable power to the coils. As a wild guess, its prediction system is generally accurate (after all, it is ultimately moving from point A to point B with the requested velocity), but is inducing some kind of lag?

There is only one thing I can think of that is suboptimal in Klipper w.r.t. StealthChop2 autotuning: "Just two steps have to be respected by the motion controller for best results

Well, my read of the specs (for what that's worth) is that stealthChop is always in a "tuning mode" and thus it's not really important that the pause comes before the homing move or vice-versa. So, I don't think there is a practical issue, as there will be a pause (and a healthy mix of moves at various speeds) leading up to the print.

I can think of 2 tests:

Okay, if I get a chance, I'll try to put something together.

Thanks.
-Kevin

@pedrolamas
Contributor

I've been running my corexy (TMC5160s) like this on all motors since this was posted and my motor temps are the same as before.

Can confirm the same findings with my Ender-3 V2 with an SKR2 running TMC2209s.

@dmbutyugin
Collaborator

FWIW, I'm not sure the lag would show up as a dimensional error. I'd guess the toolhead is mostly following the commanded path, just not with great extruder synchronization and maybe with some minor dimensional issues for diagonal moves. It might show up as minor blobbing issues; and it's possible people are tuning some of that out with PA and flow_rate calibration. Admittedly, I'm "hand waving" a lot here. The data from the angle sensor is pretty convincing though.

If the move is, say, at 30 degrees to the X axis, and the lag is 0.15 mm on the X and Y steppers of a Cartesian printer, that should manifest itself as a line being offset from its intended position. If I didn't mess up the calculation, the offset = lag * (cos(alpha) - sin(alpha)) ~= 0.055 mm in such case. OK, TBH, it is not that much, and perhaps can go unnoticed.
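
The estimate above can be reproduced with a short Python sketch. It only evaluates the formula exactly as stated in the comment (which is itself offered with a caveat); the function name is made up for illustration.

```python
import math


def line_offset_mm(lag_mm, angle_deg):
    """Offset of a printed line, per the formula quoted in the comment:
    offset = lag * (cos(alpha) - sin(alpha)).
    """
    alpha = math.radians(angle_deg)
    return lag_mm * (math.cos(alpha) - math.sin(alpha))


# Values from the comment: 0.15 mm lag on both steppers, move at 30 degrees.
offset_30 = line_offset_mm(0.15, 30.0)
print(f"offset at 30 degrees: {offset_30:.3f} mm")

# Under the same formula, the offset vanishes for a 45 degree move.
offset_45 = line_offset_mm(0.15, 45.0)
print(f"offset at 45 degrees: {offset_45:.3f} mm")
```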

FWIW, I've spent a lot of time parsing the TMC specs, and I feel they are equal parts "hand waving", "technical information", and "marketing product brochure". Often when I read them I end up with more questions than when I started. So, take what I say "with a grain of salt".

That is fair. It is likely the case that Trinamic is protecting their IP, so they refrain from revealing too many details about StealthChop operation.

I didn't get the impression that there is any kind of "command buffering"- I'd guess it's more about a systemic lag developing from its prediction system. If I understand stealthChop correctly, it measures how quickly it can push current through the coils, just as spreadCycle does. However, it doesn't use those measurements to directly control current, but instead uses them to make future predictions on how long to enable power to the coils. As a wild guess, its prediction system is generally accurate (after all, it is ultimately moving from point A to point B with the requested velocity), but is inducing some kind of lag?

If I read the available data correctly, the tuning process tries to find the optimal values for PWM_OFS_AUTO and PWM_GRAD_AUTO parameters. These values are then used in a proportional regulator to adjust PWM_SCALE_AUTO value. That scale (with PWM_OFS_AUTO) is multiplied by the target sine wave value in order to determine the actual PWM duty cycle. So, unlike SpreadCycle chopper, which regulates the current during each chopper cycle (on, slow decay, fast decay), here the cycles are not regulated individually, but instead the duty cycle is slowly adjusted in the hope of getting the requested sine amplitude (and IRMS).

Given that the PWM duty cycle is determined as PWM_SUM * sin(t), it is not likely that the system develops significant lag trying to 'push' the current through coils. It always pushes a proper sine wave, but if PWM_SUM is insufficient, it may lead to lower amplitude sine wave through the coils and thus lower torque of the stepper. However, I'd expect that PWM_SUM stabilizes during the constant velocity operation, and thus the lag would be reduced, similar to SpreadCycle. But it stays constant pretty much throughout the whole constant velocity moves. So, a mystery.

Well, my read of the specs (for what that's worth) is that stealthChop is always in a "tuning mode" and thus it's not really important that the pause comes before the homing move or vice-versa. So, I don't think there is a practical issue, as there will be a pause (and a healthy mix of moves at various speeds) leading up to the print.

Ah, you are right: "Automatic tuning adapts to changed conditions whenever AT#1 and AT#2 conditions are fulfilled in the later operation."

@KevinOConnor
Collaborator Author

FYI, I ran an additional test on the Zero. I used the following macro to test moves and pauses largely in the same direction:

[gcode_macro MOVES_TEST]
gcode:
    {% set XYZ_start = (110, 110, 20) %}
    {% set accels = [500, 3000, 7000, 10000] %}
    {% set pos_list = [95.01, 82.02, 69.03, 55.04, 40.05, 20.06, 10.07, 25.08, 35.10, 52.11, 67.12, 72.13, 87.14, 90.16, 100.18, 110] %}
    {% set speed = 100 %}

    # Test code
    {% set initspeed = 20 %}
    G28
    G1 P500
    SET_VELOCITY_LIMIT ACCEL_TO_DECEL=999999
    G1 X{XYZ_start[0]} Y{XYZ_start[1]} Z{XYZ_start[2]} F{initspeed * 60}
    G4 P200

    {% for accel in accels %}
        SET_VELOCITY_LIMIT ACCEL={accel}
        {% for pos in pos_list %}
            G1 X{pos} Y{pos} F{speed * 60}
            G4 P100
        {% endfor %}
    {% endfor %}
    G4 P2000
    M84

SpreadCycle mode (1amp, 128 microsteps, interpolation off):
[graph: mytest]
StealthChop mode (1amp, 128 microsteps, interpolation off):
[graph: mtest_st]

In general, it looks to me that different acceleration levels don't notably impact the lag. There is a small measurable and repeatable inaccuracy when stopping at different microstep positions. That inaccuracy is similar between stealthChop and spreadCycle. StealthChop continues to show a repeatable high distance lag during moves.

It is only necessary to calculate PWM_OFS and PWM_GRAD registers from the known motor parameters as per datasheet.

I'm not seeing how one would calculate that. FWIW, the X, Y, E motors in this machine are https://www.omc-stepperonline.com/download/14HS20-1504S.pdf and I see the following from DUMP_TMC after the above stealthchop test:

PWM_AUTO: 001c0055 pwm_ofs_auto=85 pwm_grad_auto=28
PWM_SCALE: 0000002f pwm_scale_sum=47
PWMCONF: c80d0e24 pwm_ofs=36 pwm_grad=14 pwm_freq=1 pwm_autoscale=1 pwm_autograd=1 pwm_reg=8 pwm_lim=12
DRV_STATUS: c0110000 cs_actual=17 stealth=1 stst=1
CHOPCONF: 21010053 toff=3 hstrt=5 tbl=2 mres=1(128usteps) dedge=1
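
For readers unfamiliar with the dump format, here is a minimal Python sketch of how the raw PWM_AUTO value above splits into its two fields. The bit positions (offset in bits 0-7, gradient in bits 16-23) are an assumption based on the TMC2209 register layout, and the helper name is hypothetical; the result agrees with the decoded values printed in the dump.

```python
def decode_pwm_auto(reg_value):
    """Split a raw PWM_AUTO register value into its fields.

    Assumed layout: pwm_ofs_auto in bits 0-7, pwm_grad_auto in bits 16-23.
    """
    return {
        "pwm_ofs_auto": reg_value & 0xFF,
        "pwm_grad_auto": (reg_value >> 16) & 0xFF,
    }


# Raw value from the DUMP_TMC output above.
fields = decode_pwm_auto(0x001C0055)
print(fields)
```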

Raw data:
mtest_st.json.gz
mtest_st.index.gz
mtest.index.gz
mtest.json.gz
Graphs above produced with: scripts/motan/motan_graph.py mtest -g '[["trapq(toolhead,x_velocity)"], ["deviation(angle(angle_x),stepq(stepper_x))"]]' -s 12 -d 24 (using the work-angle-20210722 branch).

-Kevin

@dmbutyugin
Collaborator

@KevinOConnor Thanks!

Sorry, I meant more like a test (e.g. starting from X=0, Y=0)

G0 X40 Y0 F3000
G0 X80 Y0 F6000
G0 X120 Y0 F9000

So, there are consecutive accelerations of the same magnitude in the same direction.

As for the parameters, from the datasheets (I assume 24 V PSU):

PWM_OFS = 374 * R_COIL * I_COIL / V_M = 374 * 2.8 (Ohm) * 1 (Amp) / 24 (Volt) = 43.6 (44?)

This is called PWM_AMPL in TMC2209 datasheet by mistake in one place I assume (should have been PWM_OFS).

PWM_GRAD = C_BEMF * 2 * pi * f_CLK * 1.46 / (V_M * MSRP),
with MSRP - microsteps per rotation (128 * 200 = 25600), and
C_BEMF = Hold_Torque / (2 * I_COIL_NOM)

Combining together, and assuming 12MHz internal clock being used by the driver (is it?),

PWM_GRAD = 0.4 (N*m) / (2 * 1.5 (A)) * 2 * 3.14 * 12e6 (Hz) * 1.46 / (24 (Volt) * 25600) = 23.88 (24?)

This mode requires setting pwm_autoscale = 0.
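
The two calculations above can be checked with a short Python sketch. It evaluates the formulas exactly as written in this comment, with the same motor and supply values; the function names are made up for illustration, and `math.pi` is used instead of 3.14, so the last digits differ slightly.

```python
import math


def pwm_ofs(r_coil_ohm, i_coil_a, v_m):
    """PWM_OFS = 374 * R_COIL * I_COIL / V_M (as given in the comment)."""
    return 374.0 * r_coil_ohm * i_coil_a / v_m


def pwm_grad(hold_torque_nm, i_coil_nom_a, f_clk_hz, v_m,
             microsteps, full_steps_per_rotation=200):
    """PWM_GRAD = C_BEMF * 2 * pi * f_CLK * 1.46 / (V_M * MSRP)."""
    c_bemf = hold_torque_nm / (2.0 * i_coil_nom_a)
    msrp = microsteps * full_steps_per_rotation  # microsteps per rotation
    return c_bemf * 2.0 * math.pi * f_clk_hz * 1.46 / (v_m * msrp)


# Values used in the comment: 2.8 Ohm coil, 1 A run current, 24 V supply,
# 0.4 N*m holding torque at 1.5 A nominal, 12 MHz clock, 128 microsteps.
ofs = pwm_ofs(2.8, 1.0, 24.0)
grad = pwm_grad(0.4, 1.5, 12e6, 24.0, 128)
print(f"PWM_OFS  ~= {ofs:.2f} -> {round(ofs)}")
print(f"PWM_GRAD ~= {grad:.2f} -> {round(grad)}")
```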

@KevinOConnor
Collaborator Author

KevinOConnor commented Dec 2, 2021

Wow. I was playing with the graphs and noticed this in the stealthChop data:
[graph: mtest_st_sc]

The blue line (which is a calculated derivative on the raw angle data) shows jitter near the places where the orange line (which is a calculated derivative on the raw step pulses sent) shows jitter.

The orange line has jitter due to step compression - the Klipper host code is allowed to move a step pulse time around within a window of 25us in order to reduce bandwidth. Zooming in we can see an even closer correlation:
[graph: mtest_st_sc2]

This is a bit off topic here (and I don't think it has any relevance to the main findings in this PR), but I was a bit stunned that the angle sensors could pick up the impact of step compression. These are timing differences of a handful of microseconds!

-Kevin

First graph produced with: scripts/motan/motan_graph.py mtest_st -g '[["derivative(angle(angle_x))","derivative(stepq(stepper_x))"]]' -s 17.25 -d .4 and second with scripts/motan/motan_graph.py mtest_st -g '[["derivative(angle(angle_x))","derivative(stepq(stepper_x))"]]' -s 17.4 -d .17

EDIT: FYI, it's also possible to overlay the requested stepper velocity with something like: scripts/motan/motan_graph.py mtest_st -g '[["derivative(angle(angle_x))","derivative(stepq(stepper_x))","derivative(kin(stepper_x))?alpha=.4"]]' -s 17.25 -d .4

@KevinOConnor
Collaborator Author

So, there are consecutive accelerations of the same magnitude in the same direction.

I'm not sure I understand. The test does move the head in the same direction - it moves through these positions:
95.01, 82.02, 69.03, 55.04, 40.05, 20.06, 10.07, 25.08, 35.10, 52.11, 67.12, 72.13, 87.14, 90.16, 100.18, 110 with a 100ms pause between each position. Were you looking for different velocities between positions and not a pause? (Note that this is a corexy, so I generated moves to isolate the "x+y" motor.)

As for the parameters, from the datasheets (I assume 24 V PSU):

Ah, I must have missed that - do you have a pointer to the datasheet and its location?

Yes, 24V.

12MHz internal clock being used by the driver (is it?)

Yes, the tmc2209 should have a nominal clock of 12MHz.

-Kevin

@dmbutyugin
Collaborator

Were you looking for different velocities between positions and not a pause? (Note that this is a corexy, so I generated moves to isolate the "x+y" motor.)

Yes, different velocities without any pauses. Yeah, what I provided was just an example; naturally, for corexy it should be diagonal moves.

Ah, I must have missed that - do you have a pointer to the datasheet and its location?

Yes,

TMC2209: https://www.trinamic.com/fileadmin/assets/Products/ICs_Documents/TMC2209_Datasheet_rev1.06.pdf
6.4 Velocity Based Scaling, starting on page 41.

TMC5160: https://www.trinamic.com/fileadmin/assets/Products/ICs_Documents/TMC5160A_Datasheet_Rev1.15.pdf
7.4 Velocity Based Scaling, starting on page 63.

The orange line has jitter due to step compression - the Klipper host code is allowed to move a step pulse time around within a window of 25us in order to reduce bandwidth. Zooming in we can see an even closer correlation:

Oh, well. I knew about this optimization, but I didn't put the magnitude into perspective previously. Actually, the time between steps at 128 microstepping at 200 mm/s with 1.8° steppers and a 40 mm rotation distance can be 0.2 mm / 200 mm/s / 128 = 7.8 us, if I didn't mess up the computations. So, this is noticeably less than the 'optimization window'. I wonder if this could affect microstepping accuracy? With 16 microstepping the time between steps is 62.5 us in this case, but now we start recommending to disable interpolation and use high microstepping values if possible. I wonder if the optimization should account for the time between consecutive steps? Or maybe it does already?
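
The step timing above can be reproduced with a minimal Python sketch. The function name is hypothetical; the inputs (40 mm rotation distance, 200 full steps per rotation, 200 mm/s, 128 and 16 microsteps) are the ones used in this comment.

```python
def step_interval_us(rotation_distance_mm, full_steps_per_rotation,
                     microsteps, speed_mm_s):
    """Time between consecutive microstep pulses at constant speed."""
    microstep_mm = rotation_distance_mm / (full_steps_per_rotation * microsteps)
    return microstep_mm / speed_mm_s * 1e6


interval_128 = step_interval_us(40.0, 200, 128, 200.0)
interval_16 = step_interval_us(40.0, 200, 16, 200.0)
print(f"128 microsteps: {interval_128:.2f} us between steps")
print(f" 16 microsteps: {interval_16:.2f} us between steps")
```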

Separately, I think this could negatively affect StealthChop regulator. I imagine that in the automatic mode, it may still be using
PWM_SCALE_SUM = PWM_OFS_AUTO + PWM_GRAD_AUTO * 256 * f_STEP / f_CLK,
and so changing f_STEP erratically can lead to spikes in the PWM'ed voltage and current that StealthChop proportional regulator is not able to sufficiently compensate by adjusting PWM_GRAD_AUTO. This can lead to erratic torque changes, which could explain the jitter in the velocity you have observed.

@KevinOConnor
Collaborator Author

KevinOConnor commented Dec 2, 2021

Thanks.

I wonder if the optimization should account for the time between consecutive steps? Or maybe it does already?

Yes - the window is the lower of 25us or half the time since the last step. I also noted the high step frequency last night and tried reducing this check to an 8th:

--- a/klippy/chelper/stepcompress.c
+++ b/klippy/chelper/stepcompress.c
@@ -91,7 +91,7 @@ minmax_point(struct stepcompress *sc, uint32_t *pos)
 {
     uint32_t lsc = sc->last_step_clock, point = *pos - lsc;
     uint32_t prevpoint = pos > sc->queue_pos ? *(pos-1) - lsc : 0;
-    uint32_t max_error = (point - prevpoint) / 2;
+    uint32_t max_error = (point - prevpoint) / 8;
     if (max_error > sc->max_error)
         max_error = sc->max_error;
     return (struct points){ point - max_error, point };

It showed a minor improvement, but there was clearly still an interaction with step compression.
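
As a rough illustration of the window rule described above ("the lower of 25us or half the time since the last step") and of the experimental change to an 8th, here is a simplified Python model. It works in floating-point microseconds, whereas the C code operates on integer clock ticks; the function and parameter names are made up for this sketch.

```python
def max_error_us(step_interval_us, divisor=2, cap_us=25.0):
    """Largest shift allowed for a step pulse under the described rule.

    The window is the smaller of a fixed cap and a fraction (1/divisor)
    of the time since the previous step.
    """
    return min(cap_us, step_interval_us / divisor)


# Step intervals discussed earlier in the thread (128 and 16 microsteps).
for interval in (7.8125, 62.5):
    half = max_error_us(interval, divisor=2)
    eighth = max_error_us(interval, divisor=8)
    print(f"interval {interval:7.4f} us: window {half:.4f} us (1/2), "
          f"{eighth:.4f} us (1/8)")
```

At the 7.8 us interval the fixed 25us cap never applies, so the allowed shift is set entirely by the fraction of the step interval.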

Separately, I think this could negatively affect StealthChop

I agree.

I also noticed that the vibrations are showing up on the toolhead accelerometer ( ./scripts/motan/motan_graph.py mtest_st -g '[["derivative(angle(angle_x))","derivative(stepq(stepper_x))","derivative(kin(stepper_x))?alpha=.4"],["adxl345(adxl345,x)"],["deviation(angle(angle_x),kin(stepper_x))"]]' -s 17.25 -d .4 ), so it does appear Klipper could do better here.

I also tried going down to 16 microsteps with interpolation enabled - alas that seems to show reproducible lost steps in the 10K accel part of the test. So, clearly not an improvement in general (though I don't know if it's related to step compression or not).

-Kevin

@KevinOConnor
Collaborator Author

@dmbutyugin - FYI, I ran a few more tests. I added the following to both the stepper_x and stepper_y tmc sections:

driver_PWM_OFS: 44
driver_PWM_GRAD: 24
driver_PWM_AUTOSCALE: False
driver_PWM_AUTOGRAD: False

At a glance, it didn't seem to have a noticeable impact.
mtest_st_manual.json.gz
mtest_st_manual.index.gz

I also created a new "variable speed" macro:

[gcode_macro SPEEDS_TEST]
gcode:
    {% set XYZ_start = (110, 110, 20) %}
    {% set accels = [500, 3000, 7000, 10000] %}
    {% set pos_list = [95.01, 82.02, 69.03, 55.04, 40.05, 20.06, 10.07, 25.08, 35.10, 52.11, 67.12, 72.13, 87.14, 90.16, 100.18, 110] %}
    {% set speeds = [20, 80, 40, 100] %}

    # Test code
    {% set initspeed = 20 %}
    G28
    G1 P500
    SET_VELOCITY_LIMIT ACCEL_TO_DECEL=999999
    G1 X{XYZ_start[0]} Y{XYZ_start[1]} Z{XYZ_start[2]} F{initspeed * 60}
    G4 P200

    {% for accel in accels %}
        SET_VELOCITY_LIMIT ACCEL={accel}
        {% for pos in pos_list %}
            G1 X{pos} Y{pos} F{speeds[loop.index0 % speeds|length] * 60}
        {% endfor %}
    {% endfor %}
    G4 P2000
    M84

Which does seem to show some quirky results (scripts/motan/motan_graph.py stest_st -g '[["trapq(toolhead,x_velocity)"], ["deviation(angle(angle_x),stepq(stepper_x))"]]' -s 10 -d 35):
stest_st

Raw logs (stealthchop; no customization):
stest_st.json.gz
stest_st.index.gz
Raw logs (spreadCycle):
stest.json.gz
stest.index.gz

-Kevin

@susisstrolch

Typo in your gcode?
Shouldn't it be
G28
G4 P500

And a follow-up question: how does DEDGE influence the measurement?

@Sineos Sineos mentioned this pull request Dec 6, 2021
@Sineos
Collaborator

Sineos commented Dec 6, 2021

Have you thought about bringing your findings to Trinamic's attention?

@DroneMang

DroneMang commented Dec 7, 2021

After upgrading to Klipper 0.10.0 my steppers started making a high pitch squealing noise if not moving, so I specified a hold current. SKR Pro and TMC5160's. Before the upgrade I had hold current the same as run current for all but Z axis (hold current was specified but same). I had them all in spreadcycle but a mix of interpolate on and off.

I have been experimenting with stealthchop but just turned off stealth chop and interpolate on all axes and now no squealing.

On my SKR Pro, I have been running 0 for the step pulse on all steppers for a long time and never had any issues, though one user reported layer shifts. Doing this allows a higher microstep setting for the same speeds before Klipper throws an error. Going from 2 to 0 roughly doubled the print speed I could reach, or doubled the possible microstep setting. This would make up some of the difference from losing interpolation.
Thanks!

@djamu

djamu commented Dec 11, 2021

If I may add the following to this discussion.
Over the last years I've been developing an adaptive closed-loop controller / driver that is able to drive cheap / common low impedance stepper motors under load up to 2500RPM and has a resolution of 819200 steps / revolution ( 4096 uSteps ).
Last year I ported it for use with TMC drivers, not using the step/dir interface, but the SPI XDIRECT interface, which allows for direct manipulation of winding current and offers more precise control and less latency. The driver has more in common with a stereo synthesiser than with a stepper driver; in essence the TMC driver becomes a precise dual current controller.

There are a couple of observations I wish to share:

  1. Don't assume your motor response is linear, the magnetic field across poles differ due to mechanical and electrical material properties, differences in wire gauge, differences in magnets, different amount of windings on poles, different electrical properties of H bridge driver, etc.....
    The following graph shows the average positional offset of a random 200 step / rev bipolar stepper motor. Measurement is 16x oversampled ( 8 times fwd / 8 bckw )
    X scale is 200 steps, Y scale is 32 usteps/step.
    Visible is the cyclic offset that repeats itself 50 times / rev, and the total offset.
    Orange line are the measured values.
    Blue are the normalised values.
    Yellow is the cyclic averaged.
    motor-correction-1stpass-lastpass-spline

As one can see, the deviation in positional error can be quite substantial. Max error in this case is 9, which means that when using a 32mm pulley ( 16 teeth gt2 ) the print error for this motor is 0.045mm without load.
For a stepper motor to run at 2500RPM I needed to make its response linear, as correct timing is crucial.

  2. StealthChop. While being quiet, I can confirm this is bad for positional error.
    The main reason for this ( if I recall correctly ) is its asynchronous nature. In StealthChop, the chopper works asynchronously; this induces current in its neighbouring poles, which causes them to generate a parasitic magnetic field that offsets the motor. It's just EMI / back EMF, which gets worse when upping current.

my 5 cents

@dmbutyugin
Collaborator

@djamu Thanks a lot for sharing your data, this is indeed very interesting!

Max error in this case is 9, which means that when using a 32mm pulley ( 16 teeth gt2 ) the print error for this motor is 0.045mm without load.

Typical positional accuracy on many stepper datasheets is +/-5% (non-accumulating). That is up to +/- 13 msteps / step, or +/- 0.01 mm with a 20 teeth pulley on the full steps, and likely more in the intermediate positions. So, that looks very believable and likely more or less in line with the expected accuracy of a stepper motor.
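Both figures (0.045 mm from the measured motor and 0.01 mm from the datasheet tolerance) follow from the same conversion, sketched below. The helper name is made up for illustration, and the 256 microsteps per full step used to reproduce the "13 msteps" figure is an assumption, as the comment does not state it.

```python
def step_fraction_to_mm(fraction_of_full_step, rotation_distance_mm,
                        full_steps_per_rotation=200):
    """Convert an angular error, given as a fraction of one full step,
    into linear travel at the belt."""
    return fraction_of_full_step * rotation_distance_mm / full_steps_per_rotation


if __name__ == "__main__":
    # Measured motor: 9 microsteps of error at 32 microsteps/step, 16T GT2 pulley (32 mm/rev).
    print(step_fraction_to_mm(9 / 32, 32.0))
    # Datasheet tolerance: 5% of a full step, 20T GT2 pulley (40 mm/rev).
    print(step_fraction_to_mm(0.05, 40.0))
    # 5% of a full step in microsteps, assuming 256 microsteps per full step.
    print(0.05 * 256)
```

Note that the measured error of 9/32 of a step is about 28% of a full step, which is larger than the 5% datasheet figure when both are expressed in the same units.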

The main reason for this ( if I recall correctly ) is its asynchronous nature. In StealthChop, the chopper works asynchronously; this induces current in its neighbouring poles, which causes them to generate a parasitic magnetic field that offsets the motor.

I likely don't fully understand all details here. Do you mean that the StealthChoppers for two coils are not synchronized and can affect each other?

After upgrading to Klipper 0.10.0 my steppers started making a high pitch squealing noise if not moving, so I specified a hold current.

In general, the issue with SpreadCycle is that it, technically, requires some tuning. There are recommended starting parameters, but they may not work well for all stepper motors. StealthChop2 (on TMC2209 and TMC5160, at least) pretty much works fully autonomously and auto-tunes its parameters. So, if we suggest that users switch to SpreadCycle, it would be beneficial to suggest a tuning procedure. There are some suggestions from Trinamic, e.g. the "Quick Configuration Guide" chapter in the TMC2209 and TMC5160 datasheets, and
https://www.trinamic.com/fileadmin/assets/Support/AppNotes/AN001-SpreadCycle.pdf
Those are a bit involved and require an oscilloscope, for instance. But I wonder if this tuning can be assisted with the help of an accelerometer and/or an angle sensor?

@KevinOConnor
Collaborator Author

Typo in your gcode?
Shouldn't it be
G28
G4 P500

Yes. Good catch. Thankfully it doesn't impact the test results.

And a follow-up question: how does DEDGE influence the measurement?

Not at all as far as I'm aware.

Have you thought about bringing your findings to Trinamic's attention?

FWIW, I hadn't thought about that.

Thanks,
-Kevin

@KevinOConnor
Collaborator Author

@djamu - interesting, thanks. I too see the same cyclical behavior on my steppers. If looking at the top graph at #4977 (comment) :

moves_10

One can see that the angle sensors report readings that vary over a range of about 20um when the stepper is moving. If one were to "zoom into a smaller time range" then the graph would show that the "noise" pattern is actually a repeating cyclical pattern not unlike your graph.

pattern

It's not angle sensor noise - the noise from the sensor itself is about 5um - as can be seen at the end of the test when the motor isn't moving. These graphs were generated after angle sensor calibration, and the waves during movement correlate well with vibrations picked up by the accelerometer on the toolhead.

FWIW, in addition to the sources of error you've pointed out (imbalances in windings, magnets, teeth) I suspect detent forces in the stepper also lead to some of the recurring position errors. (That is, the permanent magnet in the rotor pulls toward the iron teeth in the stator - and that force varies depending on the requested position of the stepper motor.) FWIW, it may be possible to tune the sine wave function on the tmc2130/tmc5160 drivers to account for some of these errors.

-Kevin

Graph is from ./scripts/motan/motan_graph.py stest-20211129/moves_10 -s 29.1 -d .25 -g '[["trapq(toolhead,x_velocity)"], ["deviation(angle(angle_x),stepq(stepper_x))"]]' on work-angle-20210722 branch.

@KevinOConnor
Collaborator Author

In general, the issue with SpreadCycle is that it, technically, requires some tuning.

FWIW, I've tried tuning spreadCycle a few times (on a few different printers) and I've never been successful in measuring a mechanical improvement over the default settings. I suspect that the trinamic docs may be orientated at big industrial motors and common small 3d printer steppers may not require active configuration.

Those are a bit involved and require an oscilloscope, for instance. But I wonder if this tuning can be assisted with the help of an accelerometer and/or an angle sensor?

I did attempt to do just that. Much of the results I've posted here were actually found during my attempt to come up with a mechanism to tune spreadCycle.

Alas, no matter how badly I mis-tune the spreadCycle settings on this particular test printer, I've been unable to measure an induced mechanical jitter. (My goal was to measure mechanical jitter as a proxy for a current probe / oscilloscope.) Neither the angle sensors nor the accelerometer show a strong signal when changing driver_TBL or driver_HEND. If I get further time, I may also try running the results through a FFT to see if that can pick up a signal. However, I suspect the low resistance steppers on this printer may operate fine even at the most aggressive tmc settings.

FWIW, I can measure a strong audio signal using the "spectroid app" on my android phone with varying driver_TOFF. Which was nice to see. However, there is a bit of a "myth" that one can make spreadCycle quiet with tuning. In my experience, if spreadCycle mode is loud with the default settings then it's likely to be loud with all settings. It depends on the type of noise - if one is directly hearing the chopper frequency - which sounds like a pure musical note - almost like a note played on a flute - then it is possible to tune that away. I suspect it is unlikely one would get that with the default settings though. In contrast, if one is hearing a "warbling hissing sound" then it seems that is a "secondary resonance" created by the chopper and I've not seen success in tuning that away. It seems some printer frames/steppers/electronics don't have the issue - for example, my Voron2 is basically silent with spreadCycle mode, while my delta and the makergear m2 were incredibly loud. YMMV.

-Kevin

@Taudris

Taudris commented Dec 12, 2021

With my E3D 0.9 degree 48mm steppers, TMC5160, VM at 24V, SKR 1.3, I was able to see a significant improvement in surface quality by changing HSTRT and HEND to 3 and 3. I also observed a less "sickly" sound to the motor movement, and my printer is now able to reach higher speeds without skipping steps. (It's not any quieter, though. I'm also unable to use these motors in StealthChop mode; it has terrible resonance and reduced torque, and I've not been able to get it to work correctly by tuning it. Not that I really know what I'm doing... The datasheet makes for some heavy reading for a hobby.)

The attached model is a slightly conical cylinder (smaller at the top) that is 2 full steps of movement smaller at the top, assuming 32mm rotation distance. Slice it as a shell or in vase mode. The model is 40mm tall, but in my case, the difference was so stark that I'd need only 5mm or so of the height to see it.

Microstep Quality.zip

@Tahx

Tahx commented Dec 12, 2021

Hi @dmbutyugin

This seems to be really a lot. And on paper, it looks like the motor is close to skipping a step. So, the question is, is it a random deviation, or a position lag?

In this case, I was referring to the "stepper lag" we discussed in #4560 . It's a reproducible position lag during constant velocity moves. I reran some tests for comparison purposes.

Current 1amp, 128 microsteps, interpolation off, spreadCycle: moves_10
Current 1amp, 128 microsteps, interpolation off, stealthChop: moves_10_st
When in stealthChop mode, the motors show a consistent deviation that is significantly larger than when in spreadCycle mode.

And if it is a lag, is it velocity- and current- and microstepping-dependent?

Seems not. Changing velocity only has a minor impact, changing current only has a minor impact, changing interpolation doesn't seem to have an impact (at least when at 128 microsteps), and the microsteps are already quite high (128).

Basically, if the lag comes from internal processing in TMC driver, and not from the loss of torque

It does seem that way to me.

it might be a useful recommendation to set the same parameters for the drivers of all moving axes (e.g. same microstepping, and enabling stealthChop for all of them) so that the lag is consistent and reduces de-synchronization between the linear motion of different axes and the extruder, for instance.

FWIW, I don't see how this would help. The lag seems to be a distance and not a time. I'd expect the X, Y, and E steppers to typically move with different velocities - so, a constant distance deviation will result in a notably different effective time lag. Thus I don't see a benefit in trying to synchronize the distance lag.
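The distance-versus-time argument can be illustrated with a short sketch. The 0.160 mm lag is the stealthChop figure mentioned in the proposed documentation; the per-axis velocities, and the idea that every axis sees the same distance lag, are hypothetical simplifications.

```python
def time_lag_ms(distance_lag_mm, velocity_mm_s):
    """Effective time lag, in milliseconds, implied by a fixed distance lag
    at a given constant velocity."""
    return distance_lag_mm / velocity_mm_s * 1000.0


if __name__ == "__main__":
    DISTANCE_LAG_MM = 0.160  # illustrative, from the stealthChop measurements
    # Hypothetical axis velocities during one move.
    for name, velocity in (("X", 200.0), ("Y", 50.0), ("E", 5.0)):
        lag = time_lag_ms(DISTANCE_LAG_MM, velocity)
        print(f"{name}: {velocity:6.1f} mm/s -> {lag:5.1f} ms")
```

Because the implied time lag scales inversely with velocity, matching the distance lag across axes does not make them lag by the same amount of time.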

And if it is due to torque, it indicates, in general, a potentially unsafe operation

I'm not seeing an issue like that. I can still move the steppers at 200mm/s and still see similar amounts of lag with 0.5amps, 1.0amps, and 1.5 amps. I didn't see anything that indicates steps would be lost. (To "lose a step", the stepper would need to lag two full steps.)

Then a separate question is Z axis, could this introduce a backlash effect and introduce systematic issues with the first layers (and further layers if Z hop or bed meshing is enabled).

This is a good point. Indeed, I fear that translations like bed_mesh may not interact well with stealthChop. Alas, I don't have an angle sensor on this printer's Z axis, and it's not immediately easy for me to add one. It might be possible to put together a test on the XY axes (which do have angle sensors), but I suspect a good test should include the impact of gravity.

Any available data would be very helpful.

I reran some tests today.

Current 1amp, 128 microsteps, interpolation off, spreadCycle: moves_10.index.gz moves_10.json.gz
Current 1amp, 128 microsteps, interpolation off, stealthChop: moves_10_st.json.gz moves_10_st.index.gz
Current 0.5amp, 128 microsteps, interpolation off, stealthChop: moves_05_st.index.gz moves_05_st.json.gz
Current 1.5amp, 128 microsteps, interpolation off, stealthChop: moves_15_st.json.gz moves_15_st.index.gz

The graphs above were generated using: ./scripts/motan/motan_graph.py moves_10 -s 26 -d 7 -g '[["trapq(toolhead,x_velocity)"], ["deviation(angle(angle_x),stepq(stepper_x))"]]'

The runs are all using this gcode file: moves.gcode.gz

The move.gcode is unchanged from the discussion in PR #4560, so the data files there with various spreadCycle settings should still apply.

I can run additional tests if you have something in mind. This printer (a Voron Zero) has an adxl345 on the toolhead and angle sensors on X, Y, E.

Thanks, -Kevin

You said that you're running a V0 at 128 microsteps. Aren't you missing some in-between microsteps between full steps due to a lack of torque? Or are you running some beefy NEMA17s?
Because microstepping at such a fine resolution (128 = ~1%, 64 would be ~2.5%) will net you around 1% of the rated holding torque, even when running steppers at their full rated current.
That's according to some data I found on the internet from what looks to be a big industrial motor manufacturer. I'm not sure what it's really worth, though; I suspect it's not quite right because of many other parameters I might be skipping, but I couldn't find anything else explicit enough.
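The percentages quoted above match the commonly cited incremental-torque model, in which the torque available to advance by one microstep is the holding torque times sin(90 degrees / microsteps). The sketch below assumes that model; it is not taken from this thread and says nothing about the torque available during continuous motion.

```python
import math


def incremental_torque_fraction(microsteps):
    """Fraction of holding torque produced by a one-microstep displacement,
    under the idealised sinusoidal torque model."""
    return math.sin(math.pi / (2 * microsteps))


if __name__ == "__main__":
    for microsteps in (1, 16, 64, 128):
        pct = incremental_torque_fraction(microsteps) * 100
        print(f"{microsteps:3d} microsteps: {pct:6.2f}% of holding torque per microstep")
```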

@ReXT3D

ReXT3D commented Dec 12, 2021

In general, the issue with SpreadCycle is that it, technically, requires some tuning.

FWIW, I've tried tuning spreadCycle a few times (on a few different printers) and I've never been successful in measuring a mechanical improvement over the default settings. I suspect that the trinamic docs may be orientated at big industrial motors and common small 3d printer steppers may not require active configuration.

Those are a bit involved and require an oscilloscope, for instance. But I wonder if this tuning can be assisted with the help of an accelerometer and/or an angle sensor?

I did attempt to do just that. Much of the results I've posted here were actually found during my attempt to come up with a mechanism to tune spreadCycle.

Alas, no matter how badly I mis-tune the spreadCycle settings on this particular test printer, I've been unable to measure an induced mechanical jitter. (My goal was to measure mechanical jitter as a proxy for a current probe / oscilloscope.) Neither the angle sensors nor the accelerometer show a strong signal when changing driver_TBL or driver_HEND. If I get further time, I may also try running the results through a FFT to see if that can pick up a signal. However, I suspect the low resistance steppers on this printer may operate fine even at the most aggressive tmc settings.

FWIW, I can measure a strong audio signal using the "spectroid app" on my android phone with varying driver_TOFF. Which was nice to see. However, there is a bit of a "myth" that one can make spreadCycle quiet with tuning. In my experience, if spreadCycle mode is loud with the default settings then it's likely to be loud with all settings. It depends on the type of noise - if one is directly hearing the chopper frequency - which sounds like a pure musical note - almost like a note played on a flute - then it is possible to tune that away. I suspect it is unlikely one would get that with the default settings though. In contrast, if one is hearing a "warbling hissing sound" then it seems that is a "secondary resonance" created by the chopper and I've not seen success in tuning that away. It seems some printer frames/steppers/electronics don't have the issue - for example, my Voron2 is basically silent with spreadCycle mode, while my delta and the makergear m2 were incredibly loud. YMMV.

-Kevin

Thank you for this info Kevin. Coincidentally I recently spent a bit of time trying to fine tune the TMC2209 chopper parameters on my heavily modified CR-10S Pro. The printer is equipped with Moons' 0.9 steppers throughout, except the extruder is currently Moons' 1.8. The MCU is Duet 3 Mini 5+. My observations generally align with yours where the only parameter that made any perceptible difference is driver_TOFF.

I set up the initial chopper parameters using the TMC2209 spreadsheet with two goals in mind: (1) reducing the idle noise to a minimum and (2) reducing or eliminating a rather strong resonance during Y stepper movement that occurs only at approximately 2200 mm/min (36.667 mm/s). Adjustments in driver_TOFF definitely influenced the stand-still noise level and noise "signature" of the steppers, and I was able to slightly reduce this noise. Adjusting the other parameters has made no difference thus far, and I have not yet been able to influence the Y resonance in any way.

I have not yet spent any time with an oscilloscope looking at the waveforms since it is an old analogue Tek 2465A and a bit of a challenge to use for this purpose.

@DroneMang

DroneMang commented Dec 12, 2021 via email

@KevinOConnor KevinOConnor force-pushed the work-tmctuning-20211128 branch 2 times, most recently from e486646 to 22ab8cc Compare December 12, 2021 17:41
Changing motor current may itself introduce unwanted motor movement.
As such, document that specifying a hold_current is not recommended.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Now that the documentation recommends that hold_current not be set,
update the example config files to not specify a hold_current.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
…modes

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
There is no reason to explicitly set the interpolate flag to true in
the example configs as that is already the default.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
…ivers.md

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
@KevinOConnor KevinOConnor merged commit 323268e into master Dec 12, 2021
@KevinOConnor KevinOConnor deleted the work-tmctuning-20211128 branch December 12, 2021 17:46
@djamu

djamu commented Dec 13, 2021

@dmbutyugin

I likely don't fully understand all details here. Do you mean that the StealthChoppers for two coils are not synchronized and can affect each other?

Yes that is correct, it's like crosstalk, chopping frequency / duty cycle is affected by supply voltage and impedance, the on/off time of both coils overlap less/at different times, velocity changes the impedance by back-emf... Long story, synchronised choppers work better IMHO if one doesn't mind some noise....

@KevinOConnor

FWIW, in addition to the sources of error you've pointed out (imbalances in windings, magnets, teeth) I suspect detent forces in the stepper also lead to some of the recurring position errors. (That is, the permanent magnet in the rotor pulls toward the iron teeth in the stator - and that force varies depending on the requested position of the stepper motor.) FWIW, it may be possible to tune the sine wave function on the tmc2130/tmc5160 drivers to account for some of these errors.

-Kevin

Yes of course, detent torque and latent magnetism also play a part.
It is possible to tune the sine wave. The problem is that you need a 360° sine, not a 90° one, to map the cyclic offset; I think you can only remap 90° onboard the tmc.

@dmbutyugin @KevinOConnor
Can I suggest some ideas, while giving a solution for your previous question? Maybe it's better to put this in a different pull request later on.
One can elegantly solve the run_current / hold_current discussion, and make motors with TMC drivers run a lot cooler and more precisely, generate more torque, and reach higher step rates... I suggested it to the Marlin guys, but they weren't interested back then. Funnily enough, Marlin now has a G6 command that lets it work like Klipper: https://marlinfw.org/docs/gcode/G006.html
https://reprap.org/wiki/Direct_Stepping
Interesting reading...

This will only work for MCUs/drivers that support hardware SPI (preferably with DMA). There are 2 ideas:

  1. Faster stepping. I have a MCU board prototype that uses SPI + 2 shift registers + 555 for stepping.
    Instead of manipulating pins, this board uses (40MHz) SPI @ 1MHz intervals to continuously shift 2 bytes towards 2 8-bit shift registers. The highest bit triggers a 555 0.5us delay to reset the shift registers, so you don't have to un-set any pins; it can be easily cascaded. It can pulse well over 1MHz, but needs an adapter board connected to the MCU SPI bus. Not directly an alternative.

  2. FWIW, it may be possible to tune the sine wave function on the tmc2130/tmc5160 drivers to account for some of these errors.

My favorite ;-)
This is what I did...
Since the TMC drivers ( only SPI ones ) have SPI already wired up, why not feed them a ( kind of ) sine waveform through the XDIRECT interface? This has many advantages. Instead of stepping pulses, use those ticks to step a pointer through a (sinewave) lookup table and feed it to the TMCs.

A. Any mechanical / measured motor offset can be compensated by offsetting that pointer with an offset table ( 2 offset tables, 1 cyclic, 1 global ).... but then you need feedback to measure that.... this is closed loop only, I'm afraid ( essentially frequency modulation )....

B. Similar to pressure advance, one can use a variable which represents mass / torque to offset the same pointer during acc/dec phases. The result is zero ( torque related ) lag, as the magnetic field will run in advance of the actual stepper position/acc, nullifying lag when correctly applied. This is easy and can be used in both open and closed loop... the offset is a constant multiplied by the current acc factor.
I'm interested how this would work with input shaping....

C. Boosting torque. I use a very high fixed run current of 2.4A on my TMC2130s, which is normally enough to fry both your motor and driver, but use amplitude modulation of the sine wave to change the actual current.
During movement the sine wave is scaled up/down: on acc / dec the coils get the full 2.4A, and during non acc/dec phases the waveform is scaled down to approximately a third. Very much like turning the volume down. The boosting only happens for a few ms; TMC drivers are perfectly happy with that, and in general your motors and drivers will stay cool to the touch. ( In my implementation the wave can optionally be morphed to a square during that phase, creating full wave drive. )
This reduces the usable bit range, but there's plenty of headroom and it won't affect the ( relatively small ) actual motor precision. As I showed earlier in the graph, reducing the sine range to -63/63 still gives 127 positions / coil / step, still plenty...
The actual current is pretty linear with the given amplitude; half the value is half the current...
One could argue for doing the same with the SPI run_current cmd, but that is slower; amplitude modulation is almost instantaneous, and the driver has enough precision...
For my drivers the update rate is 10kHz. It can be used in open / closed loop.
I'm interested how this would work with input shaping....

D. High speed. During high velocity travels one can up the amplitude to counter back-emf ( and optionally drive motors with a square wave ( full wave ) ). Open-loop will never achieve a stable 2400RPM @15v, but 1600-1800RPM is possible..
Since no-one is interested in travel speeds of 960-1280mm/s ( using a 32 mm pulley ), the obvious use would be with a (belt) reduction. Using a 4:1 reduction will still give a travel speed of 240-320mm/s and, assuming belts have little backlash, the actual precision and net torque will increase.

This is because cool motors can generate higher torque ( magnets don't like heat ), but ball bearings will wear out faster, have no doubt...
( FYI the TMC's have 2 temperature flags, 1 overheat warning, 1 shutdown. I did a burnin test with a Nema 23 connected to a TMC2130, in the test the amplitude was lowered every ms the overheat flag was on / amplitude upped when it was off, ran it for a week to see if it would survive... no problem... still runs just fine, despite the fact that it smelled like it was melting, must have been paint ;-)
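Ideas B and C above can be sketched roughly as follows: a position pointer indexes a sine/cosine pair, an acceleration-proportional lead is added to the pointer, and the amplitude is scaled separately. This is only an illustration of the concept, not the author's driver code; the full-scale code of 255, the cruise amplitude of 85 (about a third), the lead constant, and all names are assumptions, and the actual register format should be taken from the driver datasheet.

```python
import math


def coil_currents(position_usteps, usteps_per_full_step, peak_code, lead_usteps=0.0):
    """Return (coil_a, coil_b) signed current codes for a microstep position.

    peak_code scales the waveform (amplitude modulation); lead_usteps shifts
    the magnetic field ahead of the commanded position.
    """
    # One electrical cycle spans 4 full steps.
    theta = 2 * math.pi * (position_usteps + lead_usteps) / (4 * usteps_per_full_step)
    return round(peak_code * math.sin(theta)), round(peak_code * math.cos(theta))


def lead_for_accel(accel_mm_s2, k_lead):
    """Field lead proportional to acceleration (k_lead is a hypothetical tuning constant)."""
    return k_lead * accel_mm_s2


if __name__ == "__main__":
    FULL_CODE, CRUISE_CODE = 255, 85  # assumed full scale, and roughly a third of it
    # Accelerating: full amplitude, field led ahead of the commanded position.
    print(coil_currents(10, 32, FULL_CODE, lead_for_accel(9000, 0.001)))
    # Cruising: reduced amplitude, no lead.
    print(coil_currents(10, 32, CRUISE_CODE))
```

Keeping the lead and the amplitude as separate inputs mirrors the distinction drawn above between shifting the pointer and scaling the waveform.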

Can't this be implemented using the step/dir interface? Yes, sure.... but nowhere near as well as with SPI. The acc/dec prediction generates a lot of pulses; you can't just invert polarity on 2 coils on a whim, as you'd have to step through the table, and this generates lag of its own.... I'm pretty sure you were made well aware of this problem while writing the pressure advance code.... pulse horror / blocking code.....

Started using Klipper a while ago, ( great stuff btw).
Driving TMC's completely with SPI requires little code to implement, I have it already & I was about to port Klipper to a teensy, might as well contribute...

I have some questions though...
Tried to compile klipper with VS ( the normal one not Code ), no joy, threw a lot of errors ( lots of redefinition of yada yada ... ) What IDE are you guys using ?

Klipper code question:
At one point or another you would have made the decision to either do pulses with a -variable timer fixed step size- ( single step ) approach or a -fixed timer variable step size- ( multiple pulses / time frame ) approach. Marlin uses the first, Smoothieware the latter, what approach is klipper using ? The latter would make it a little easier to implement proposed ideas.
( doesn't make much difference actually, I'm pretty sure you have global variable up/down counter somewhere I can poll on regular intervals to use on the lookup table > drive the SPI bus )
It is my personal opinion that the fixed timer burst pulse approach actually works better, as long as the timer frequency is (far) beyond motor self resonance: due to a motor's inertia, the next series of pulses will already have been transmitted before it even moves.... On my drivers I use a 10kHz update rate. Motors don't care, as they react much slower ( neither do they vibrate because of the chopper ) ...
The big advantage of this is that MCU's with a costly (slow) interrupt (like the esp32), can be used and still produce high steprates.
But that's my opinion, neither solution is actually better in practice, while the first is in theory far superior ( jitterless )...

For sake of completeness, some graphs to prove my point, how well this all works...
accel for all graphs is 9000mm/s2, sample rate is 10kHz. max velocity in graph is 43.75mm/s ( not all that fast ), lots of changes in velocity, the fragment is taken while printing small text.
The belts were disconnected; this is the graph of an unloaded motor. Under load (depending on the tightness of one's belt) it can be much, much worse.
(I plotted it unloaded as the algorithms have to work harder, reaction under load is slower / easier to predict.)

blue: is position as received by stepper driver.
red: is actual position
yellow: position of magnetic field ( 3rd and 4th graph ), in graph 1 & 2, this is the same as blue so I left it out.
In graph 3 & 4 this is the offset of the magnetic field relative to the intended position, compensating for the actual position.

This is a fragment of the aforementioned plot; horizontal is time, vertical is position.
motor-normal-abs

If we normalise this to zero (intended position becomes zero), you see the actual deviation of the motor relative to its position.
Again, this is an unloaded motor; under load it's much worse.
motor-normal-diff

What you see here is the actual positional error. What's interesting here is the frequency: this is the actual frequency at which a motor self-resonates ( I've run it through an FFT tool ). You can clearly see the motor swing around its course; the scale is 32usteps... ( ringing ;-)

Same plots, but with all of the above activated.
abs, yellow is the position of the magnetic field.
motor-closed-loop-abs

dif
motor-closed-loop-dif
the (yellow line) magnetic field swings heavily to compensate for acc/dec... the resulting course barely deviates from the intended course.... (2-3 usteps vs 25...), and I've got perfect prints.... ;-)
Within 0.1 sec it swaps polarity a few times; this is impossible to do with a step/dir interface.

I'm interested to see how this would work with input shaping. They might as well conflict as complement each other, since they do basically the same thing: one modifies the input, the other modifies the output... let's see what happens.

The proposed changes will tax one's printer to the max when you up the acceleration, and bearings will wear out much faster. I'm not sure everybody actually wants this (9000 mm/s²), but it will still work fine with normal settings.

And a stepper motor @ 2500 RPM doing 10/-10 and 100/-100 revolutions (that would be 4 travels of 320, 320, 3200, 3200 mm with a 32mm pulley):
https://user-images.githubusercontent.com/29310451/145737391-e6648087-0f01-4ec9-ad80-f880d8b0a0e7.mp4

later
;-)

@KevinOConnor
Collaborator Author

Thanks everyone for the feedback. I went ahead and committed this PR as it looks like the core discussions around hold_current and interpolate have been concluded.

There seems to be a bunch of interest in tuning spreadCycle. If I get time I'll try to clean up the test code I've written and start a topic on Klipper Discourse to discuss it further.

-Kevin

@KevinOConnor
Collaborator Author

@djamu - interesting. You might want to open a new topic on Klipper Discourse to discuss it, as I suspect it will get more exposure than on this PR.

Some random tips/pointers: Klipper calculates the step times in the host, compresses that schedule of events, transmits it to the micro-controller, and then the micro-controller schedules "software timers" to trigger at each scheduled step time. So, it's not a fixed frequency handler. There's some info on the scheduling logic at https://www.klipper3d.org/Code_Overview.html . In testing of the "mechaduino", I have implemented a fixed-frequency handler in the mcu code - #1038 .
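
To make the "compresses that schedule of events" idea a bit more concrete, here is a minimal Python sketch of how a run of step times can be described by three numbers (a starting interval, a step count, and a per-step change to the interval) and expanded back into individual step times on the receiving side. The function name `expand_steps` and the example numbers are assumptions made for illustration only; this is not Klipper's actual code, so refer to the Code Overview document for the real implementation.

```python
def expand_steps(start_clock, interval, count, add):
    """Expand a compressed run of steps into absolute step clocks.

    start_clock: clock value of the previous step (illustrative units: ticks)
    interval:    ticks between the previous step and the first step of the run
    count:       number of steps in the run
    add:         amount added to the interval after every step
                 (negative while accelerating, positive while decelerating)
    """
    clocks = []
    clock = start_clock
    for _ in range(count):
        clock += interval      # schedule the next step
        clocks.append(clock)
        interval += add        # the interval changes linearly from step to step
    return clocks


# Four steps of an accelerating move: the gaps shrink by 20 ticks each step.
print(expand_steps(start_clock=1000, interval=500, count=4, add=-20))
# → [1500, 1980, 2440, 2880]
```

The point of such a representation is that the step times do not sit on a fixed-frequency grid: each step gets its own timer event at its own computed time.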

Cheers,
-Kevin

@Gilabite

@KevinOConnor any chance if you do decide to clean up the test code and start the tuning topic on Discourse that you could post something here on Github that you did so?

@dmbutyugin
Collaborator

FWIW, I've tried tuning spreadCycle a few times (on a few different printers) and I've never been successful in measuring a mechanical improvement over the default settings. I suspect that the trinamic docs may be orientated at big industrial motors and common small 3d printer steppers may not require active configuration.

It might be. SpreadCycle has a few parameters (TOFF, HSTRT, HEND, and TBL) that control it at a low level. TBL tuning is only useful in the case of high-capacitance applications. TOFF, HSTRT and HEND control the hysteresis and, indirectly, the chopper frequency. According to the datasheet, improper settings could lead to reduced microstep accuracy, more power dissipation, and more chopper noise. Probably the worst that could happen for 3d printing applications is high-pitched noise from the steppers if the chopper frequency drops into the audible range.

Those are a bit involved and require an oscilloscope, for instance. But I wonder if this tuning can be assisted with the help of an accelerometer and/or an angle sensor?

I did attempt to do just that. Much of the results I've posted here were actually found during my attempt to come up with a mechanism to tune spreadCycle.

Alas, no matter how badly I mis-tune the spreadCycle settings on this particular test printer, I've been unable to measure an induced mechanical jitter. (My goal was to measure mechanical jitter as a proxy for a current probe / oscilloscope.) Neither the angle sensors nor the accelerometer show a strong signal when changing driver_TBL or driver_HEND. If I get further time, I may also try running the results through a FFT to see if that can pick up a signal. However, I suspect the low resistance steppers on this printer may operate fine even at the most aggressive tmc settings.

The ADXL345 has a sampling rate of 3200 samples/sec, so it is likely too slow to pick up noise from the SpreadCycle chopper. Not sure about the angle sensor, but the deviations are probably also too small to be noticeable.
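
A quick way to see the sampling-rate limit is the Nyquist frequency: at 3200 samples/sec only signals below 1600 Hz can be represented faithfully, and anything faster would at best show up folded down to some lower frequency. The Python sketch below works that out. The 30 kHz chopper frequency is purely an assumed example value (no chopper frequency was measured here), and the calculation ignores any filtering done inside the sensor.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented at the given sample rate."""
    return sample_rate_hz / 2.0


def alias_hz(signal_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after ideal sampling (folded into 0..Nyquist)."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


ADXL345_RATE_HZ = 3200.0
ASSUMED_CHOPPER_HZ = 30000.0  # assumption for illustration, not a measured value

print(nyquist_hz(ADXL345_RATE_HZ))                    # → 1600.0
print(alias_hz(ASSUMED_CHOPPER_HZ, ADXL345_RATE_HZ))  # → 1200.0
```

So even if some chopper energy did reach the accelerometer, it would not appear at its true frequency in the captured data.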

It depends on the type of noise - if one is directly hearing the chopper frequency - which sounds like a pure musical note - almost like a note played on a flute - then it is possible to tune that away. I suspect it is unlikely one would get that with the default settings though. In contrast, if one is hearing a "warbling hissing sound" then it seems that is a "secondary resonance" created by the chopper and I've not seen success in tuning that away.

I'm not sure what 'warbling hissing sound' you mean, but if it is high-pitch hissing, I have the following recent experience (though with StealthChop, surprisingly) that I can share. I've installed a different stepper on Y axis recently - 17HS16-2004S1. And in StealthChop mode with TMC5160 it was producing high-pitch hissing at standstill. Not while moving though. When I set PWM_OFS and PWM_GRAD manually (also disabling autograd), that sound went away. So it seems it was some issue with the regulator that would change the chopper parameters all the time, and that effect was audible. In general, in case of SpreadCycle reducing the hysteresis should help with its sound (by increasing the frequency and making it non-audible).

Then there is a sound of the stepper motors themselves (from their stepping), which can be picked up by accelerometer:
stepper_y
This was obtained with the following gcode:

G0 Z10 X100 Y100 F3000
M400
G4 P500
ACCELEROMETER_MEASURE CHIP=bed
G4 P500
G0 Y95 F300     ; 1 sec,    5 mm/sec
G0 Y105 F600    ; 2 sec,   10 mm/sec
G0 Y90 F900     ; 3 sec,   15 mm/sec
;  ...................................................................
G0 Y0 F11700    ; 38 sec, 195 mm/sec
G0 Y200 F12000  ; 39 sec, 200 mm/sec
M400
G4 P500
ACCELEROMETER_MEASURE CHIP=bed

In the chart we can see the lines that correspond to full stepper steps of 0.2 mm (40 mm rotation distance / 200 (steps/rotation)) that go up to 1000 Hz (200 mm/sec / 0.2 mm), double of full steps (up to 500 Hz), quadruple of full steps (up to 250 Hz), and half of full steps (0.1 mm, partially visible above the 0.2 mm line). There is also a faint line for 2mm GT2 belt (goes up to 100 Hz). These ones can be heard with StealthChop and seem to be magnified with SpreadCycle. And tuning SpreadCycle won't reduce this noise.
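
The arithmetic behind those lines is just frequency = speed / spatial period. A small Python sketch, using only the numbers quoted above (40 mm rotation distance, 200 full steps per rotation, 200 mm/sec top speed, 2 mm GT2 pitch); the function and variable names are made up for illustration:

```python
ROTATION_DISTANCE_MM = 40.0
FULL_STEPS_PER_ROTATION = 200
MAX_SPEED_MM_S = 200.0
GT2_PITCH_MM = 2.0

# Distance travelled per full step: 40 mm / 200 = 0.2 mm
full_step_mm = ROTATION_DISTANCE_MM / FULL_STEPS_PER_ROTATION


def line_frequency_hz(speed_mm_s, period_mm):
    """Frequency of a vibration that repeats once every period_mm of travel."""
    return speed_mm_s / period_mm


periods_mm = {
    "half of a full step": full_step_mm / 2,
    "full step": full_step_mm,
    "double full step": full_step_mm * 2,
    "quadruple full step": full_step_mm * 4,
    "GT2 belt pitch": GT2_PITCH_MM,
}

for name, period in periods_mm.items():
    freq = line_frequency_hz(MAX_SPEED_MM_S, period)
    print(f"{name}: {period:.1f} mm -> up to {freq:.0f} Hz")
```

At the 200 mm/sec end of the sweep this gives 2000, 1000, 500, 250 and 100 Hz respectively; at lower speeds each line scales down proportionally, which is why they appear as straight lines in the chart.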

TBH, I was a bit surprised to see the frequency of quad steps (corresponds to 0.8 mm distance), but it may be due to some peculiarities of hybrid stepper motor construction.

@Tahx

Because microstepping at such a lower microstepping (128 = ~1%, 64 would be ~2.5%) will net you around 1% of the rated holding torque when running steppers at their full amp capacity.

I think that's largely a myth: there could be some small dependency of the max torque when changing microstepping, but nothing that dramatic. FWIW, after the recent optimizations with the step scheduling in Klipper, I've been running my Ender 3 Pro at 256 microstepping on all axes, and moving bed did not give me any issues or missed steps.

@djamu

FWIW, I've previously tried to use TMC5160 sine table to reduce the noise from the stepper motors. I've had some relatively promising results. The problem was generally a lack of good data to tune the 'sine' function, so I made a test to move the axis at some constant velocity, and then used accelerometer data to iteratively try to optimize that function in different parts and choose the better candidates that would reduce the amplitude of vibrations at specific frequencies (dependent on the test velocity). Unfortunately, that would take hours to converge. But the end result was noticeable noise reduction in a wide range of speeds, though not on all speeds. I didn't send a PR or publish it for testing at the time though.

I agree that the sine table can only do so much to reduce the vibrations and increase the precision of microstepping. It also is available only on TMC2130 and TMC5160. But with recent Klipper improvements in microstepping, we could also do a similar adjustment differently and for all TMC drivers. Currently, the steps Klipper sends are 'equidistant'. But we could change that by adding a function for the stepper steps calculation that would add non-linearity to the stepper steps. At high microstepping resolution (128, 256 msteps) it will have an effect similar to adjusting the sine table, but we would not be limited by specifying 90 degrees of the sine wave: we could make it span several full steps, or even 2mm GT2 pitch. Also, with much improved timing of adxl345 measurements now we could use it to accurately map acceleration readings to the microsteps. Then, by measuring a constant speed movement over a long distance, averaging acceleration and mapping it to the microsteps, we would get a useful signal to adjust the nonlinearity function to reduce the vibrations.
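
To illustrate the "averaging acceleration and mapping it to the microsteps" part, here is a rough Python sketch of the binning step only. It assumes we already have (position, acceleration) pairs from a constant-speed move and folds them onto one repeating period (for example 4 full steps, or the 2 mm belt pitch). Everything here is hypothetical: the function name, the data layout, and the choice of period and bin count are illustration choices, not existing Klipper code.

```python
def average_accel_by_phase(samples, period_mm, bins):
    """Average acceleration readings by their position within a repeating period.

    samples:   iterable of (position_mm, accel) pairs from a constant-speed move
    period_mm: length of the repeating pattern (e.g. 0.8 mm for 4 full steps)
    bins:      number of phase bins to split the period into
    Returns a list of per-bin averages (None for bins that got no samples).
    """
    sums = [0.0] * bins
    counts = [0] * bins
    for position_mm, accel in samples:
        phase = (position_mm % period_mm) / period_mm   # 0 <= phase < 1
        index = min(int(phase * bins), bins - 1)        # guard against rounding up
        sums[index] += accel
        counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Tiny made-up data set: two samples land in each half of a 0.8 mm period.
demo = [(0.1, 1.0), (0.9, 3.0), (0.5, -2.0), (1.3, -4.0)]
print(average_accel_by_phase(demo, period_mm=0.8, bins=2))
# → [2.0, -3.0]
```

Averaging over a long move should cancel out whatever is not synchronous with the chosen period, leaving a per-phase profile that a nonlinearity function could then be tuned against.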

All I am saying is that it may not be necessary to program stepping over SPI, and that we could instead use the already available tooling and the standard dir/step interface to achieve the effect with most TMC drivers.

Separately, we should indeed move this conversation to Discourse :)

OXERY added a commit to Prutsium/3D-Druckerplausch-Klipper that referenced this pull request Jan 5, 2022
@KevinOConnor
Collaborator Author

FYI, I did not get a chance to update my TMC tuning branch. However, I have published it (in all its broken glory). There's more info at https://klipper.discourse.group/t/motion-analysis-by-stepper-phase/1876

-Kevin

@github-actions github-actions bot locked and limited conversation to collaborators Jan 25, 2023