Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSD task stalling #11333

Open
hydra opened this issue Jan 24, 2022 · 40 comments
Open

OSD task stalling #11333

hydra opened this issue Jan 24, 2022 · 40 comments
Labels
BUG Bugs are excluded from automatically being marked as stale

Comments

@hydra
Copy link
Contributor

hydra commented Jan 24, 2022

Describe the bug

The OSD task stalls, no further updates are performed.

To Reproduce

Leave it running. I've encountered the failure encountered multiple times after a long uptime. since there's no place you can place a breakpoint the exact time it fails is currently unknown.

Expected behavior

OSD should never stall.

Flight controller configuration

# version
# Betaflight / SPRACINGH7RF (SP7R) 4.3.0 Jan 23 2022 / 20:06:02 (2f2728f178) MSP API: 1.44

# start the command batch
batch start

board_name SPRACINGH7RF

# resources
resource SPI_SCK 3 NONE
resource SPI_MISO 3 NONE
resource SPI_MOSI 3 NONE

# map
map TAER1234

# aux
aux 0 0 0 1700 2100 0 0
aux 1 40 0 1300 1700 0 0

# master
set acc_calibration = 6,8,-55,1
set motor_pwm_protocol = DSHOT600
set osd_vbat_pos = 2443
set osd_rssi_pos = 353
set osd_link_quality_pos = 2369
set osd_rssi_dbm_pos = 2401
set osd_tim_1_pos = 2433
set osd_ah_sbar_pos = 2254
set osd_ah_pos = 2126
set osd_current_pos = 2411
set osd_craft_name_pos = 2121
set osd_altitude_pos = 2421
set osd_debug_pos = 416
set osd_core_temp_pos = 2456
set osd_framerate_hz = 60
set debug_mode = RX_EXPRESSLRS_SPI
set pinio_box = 40,255,255,255
#set expresslrs_uid = #private#
set scheduler_relax_rx = 20

profile 0

rateprofile 0

# end the command batch
batch end

Flight controller

SPRacingH7RF

Other components

N/A

How are the different components wired up

N/A

Add any other context about the problem that you think might be relevant here

The issue seems to be in this block:

bool osdUpdateCheck(timeUs_t currentTimeUs, timeDelta_t currentDeltaTimeUs)
{
    UNUSED(currentDeltaTimeUs);
    static timeUs_t osdUpdateDueUs = 0;

    if (osdState == OSD_STATE_IDLE) {
        // If the OSD is due a refresh, mark that as being the case
        if (cmpTimeUs(currentTimeUs, osdUpdateDueUs) > 0) {
            osdState = OSD_STATE_CHECK;  <-- NEVER EXECUTED.

Here's the OSD task, after not being executed for a long time.

Screenshot 2022-01-24 010144

Compare the lastExecutedAtUs above to the one from the RX task, below.

Screenshot 2022-01-24 011323

As you can see, the OSD task stalled, and hasn't been run for a long time.

After resetting the value of the register that contains osdUpdateDueUs to 0 prior to it being checked the task resumed OK.

After resuming the task I dropped to the cli and created the diff above and dumped the current status, as below:

# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10      37       8    0.0%    0.0%       411     83 176531       8
01 - (         SYSTEM)    955      69      16    6.5%    1.5%     45973   7301 16958040      16
02 - (           GYRO)   7955      74      17   58.8%   13.5%   1550115      0 140570969       0
03 - (         FILTER)   7955      64      17   50.9%   13.5%    946625      0 140570969       0
04 - (            PID)   7955     192      72  152.7%   57.2%   2767119      0 140570969       0
05 - (            ACC)    947      89      32    8.4%    3.0%    372960  30200 16177572      32
06 - (       ATTITUDE)     99      95      62    0.9%    0.6%     65452   1397 1737051      62
07 - (             RX)    250     157      10    3.9%    0.2%    464921 132182 17683348      40
08 - (         SERIAL)     98     259      46    2.5%    0.4%     88769      0 1372642      46
09 - (       DISPATCH)    968      50       3    4.8%    0.2%      5903   9045 17132924       3
10 - (BATTERY_VOLTAGE)     50      48      14    0.2%    0.0%      5265    464 881427      14
11 - (BATTERY_CURRENT)     50      50      13    0.2%    0.0%      3522    438 881370      13
12 - ( BATTERY_ALERTS)      5      47      37    0.0%    0.0%       847     52  88282      37
13 - (         BEEPER)     98      56       9    0.5%    0.0%      1880   1685 1372642       9
16 - (           BARO)     25     105      94    0.2%    0.2%     32356 344441 1345117      94
17 - (       ALTITUDE)     29     106      48    0.3%    0.1%     22551 154670 510216      48
19 - (      TELEMETRY)    198      80       4    1.5%    0.0%     35518  56471 3183307       4
20 - (       LEDSTRIP)     75     106      37    0.7%    0.2%      9774  94229 1220441      15
21 - (            OSD)     25     302      52    0.7%    0.1%    466890 3283574 58066296      58
22 - (PIXEL_OSD_VIDEO)     60      33       2    0.1%    0.0%       396    772 1059729       2
24 - (            CMS)     10      46      12    0.0%    0.0%      1227    389 190263      12
25 - (        VTXCTRL)      2      36       7    0.0%    0.0%       190     69  49381       7
26 - (        CAMCTRL)      2      31       4    0.0%    0.0%        64     75  58881       4
28 - (    ADCINTERNAL)      0      38      20    0.0%    0.0%        73     44  10009      20
29 - (       PINIOBOX)      7      50      14    0.0%    0.0%       987    793 152349      14
RX Check Function                2099      11                    861682
Total (excluding SERIAL)                                90.3%

# status
MCU H730 Clock=520MHz, Vref=3.27V, Core temp=63degC
Stack size: 2048, Stack address: 0x2000f000
Configuration: CONFIGURED, size: 3944, max available: 65536
Devices detected: SPI:1, I2C:1
Gyros detected: gyro 1 locked
GYRO=ICM42605, ACC=ICM42605, BARO=BMP388
OSD: SPRACINGOSD
System Uptime: 17673 seconds, Current Time: 2022-01-24T00:02:56.361+00:00
CPU:37%, cycle time: 124, GYRO rate: 8064, RX rate: 251, System rate: 9
Voltage: 1187 * 0.01V (3S battery - OK)
I2C Errors: 3
SD card: Manufacturer 0x27, 30289920kB, 01/2005, v6.0, 'SD32G'
Filesystem: Ready
Arming disable flags: CLI MSP

That uptime is a 4hrs:09mins. the RX link was up all the time.

@SteveCEvans
Copy link
Member

I've left a MATEKF722SE running overnight and the OSD is still running. Looks like this may be painful to reproduce.

# status
MCU F722 Clock=216MHz, Vref=3.28V, Core temp=47degC
Stack size: 2048, Stack address: 0x20010000
Configuration: CONFIGURED, size: 4028, max available: 16384
Devices detected: SPI:2, I2C:1
Gyros detected: gyro 1, gyro 2 locked dma shared
GYRO=MPU6000, ACC=MPU6000, BARO=BMP280
OSD: MAX7456
System Uptime: 20915 seconds, Current Time: 2022-01-24T07:59:42.290+00:00
CPU:31%, cycle time: 125, GYRO rate: 8000, RX rate: 15, System rate: 9
Voltage: 1 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 1
SD card: Startup failed
Arming disable flags: RXLOSS CLI MSP

@hydra
Copy link
Contributor Author

hydra commented Jan 24, 2022

yeah. :( When it happens again I can try and get more details from the running system. Can you think of anything that would be useful to look at?

@hydra
Copy link
Contributor Author

hydra commented Jan 24, 2022

It happened again:

image

image

image

rx task for reference values:
image

# status
MCU H730 Clock=520MHz, Vref=3.28V, Core temp=58degC
Stack size: 2048, Stack address: 0x2000f000
Configuration: CONFIGURED, size: 3944, max available: 65536
Devices detected: SPI:1, I2C:1
Gyros detected: gyro 1 locked
GYRO=ICM42605, ACC=ICM42605, BARO=BMP388
OSD: SPRACINGOSD
System Uptime: 5460 seconds, Current Time: 2022-01-24T14:37:25.055+00:00
CPU:28%, cycle time: 126, GYRO rate: 7936, RX rate: 15, System rate: 9
Voltage: 1189 * 0.01V (3S battery - OK)
I2C Errors: 3
SD card: Manufacturer 0x27, 30289920kB, 01/2005, v6.0, 'SD32G'
Filesystem: Ready
Arming disable flags: RXLOSS CLI MSP

0xFFFFFFFF (uS) / 1000 / 1000 / 60 = 71.582 minutes, I expect there's a rollover issue.

The new OSD code needs to be further reviewed to find out how/where this is happening.

@hydra
Copy link
Contributor Author

hydra commented Jan 24, 2022

@SteveCEvans Did you test with set osd_framerate_hz = 60?

@hydra
Copy link
Contributor Author

hydra commented Jan 24, 2022

It seems quite consistent here, I set a stopwatch for 70 minutes, monitored it, and at 71 minutes it stalled again.

0xffffffff = 4294967295 / 1000 / 1000 = 4294 seconds, and it died just before that according to the status output of the still-running system.

System Uptime: 4271 seconds, Current Time: 2022-01-24T15:54:42.063+00:00

@SteveCEvans
Copy link
Member

Still going strong with a fairly busy display.

# status
MCU F722 Clock=216MHz, Vref=3.28V, Core temp=50degC
Stack size: 2048, Stack address: 0x20010000
Configuration: CONFIGURED, size: 4028, max available: 16384
Devices detected: SPI:2, I2C:1
Gyros detected: gyro 1, gyro 2 locked dma shared
GYRO=MPU6000, ACC=MPU6000, BARO=BMP280
OSD: MAX7456
System Uptime: 54108 seconds, Current Time: 2022-01-24T17:12:54.879+00:00
CPU:32%, cycle time: 125, GYRO rate: 8000, RX rate: 15, System rate: 9
Voltage: 1 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 1
SD card: Startup failed
Arming disable flags: RXLOSS CLI MSP

image

Some stats at this point. I note that we've used most of the available groups. If we run out of groups then the last group would end up containing the remaining elements. This will result in a long group duration estimate. I'll raise a PR to increase this.

image
image
image
image

@SteveCEvans
Copy link
Member

I'm testing with a osd_framerate_hz = 12 as that's the same rate at 4.2. That would perhaps explain why you see failure after 4 hours at 5 times the rate I'm testing at.

@hydra
Copy link
Contributor Author

hydra commented Jan 24, 2022

I was getting the failure after just 71 minutes earlier today, the 4 hour test yesterday may have been 71 minutes too, except that I didn't check it till 4 hours after I booted it.

I'm fairly sure there's nothing wrong with the OSD code itself, since osdUpdateDueUs cannot be modified outside osdUpdateCheck and it's only ever incremented, but I have a feeling that the under roll-over conditions in the scheduler that the osdUpdateCheck method isn't being called, and then it still contains an old value for osdUpdateDueUs which is close to the roll-over value.

I don't think the osd_framerate_hz makes any difference either.

This can help explain why it doesn't always do it too, as it depends on the scheduler code.

@hydra
Copy link
Contributor Author

hydra commented Jan 24, 2022

Yeah, pretty sure it's overflow related in the scheduler now, it just did it again around 12968 seconds which is a multiple of the rollover seconds (3x).

System Uptime: 12968 seconds, Current Time: 2022-01-24T19:37:00.510+00:00

@SteveCEvans
Copy link
Member

Not happening here after 44147 seconds. OSD still updating. Can you reproduce this on another FC with a non-PixelOSD display?

# status
MCU H743 (Rev.V) Clock=480MHz, Vref=3.30V, Core temp=53degC
Stack size: 2048, Stack address: 0x2001fff8
Configuration: CONFIGURED, size: 3941, max available: 131072
Devices detected: SPI:2, I2C:1
Gyros detected: gyro 1, gyro 2 locked dma
GYRO=MPU6000, ACC=MPU6000, BARO=DPS310
OSD: MAX7456
System Uptime: 44147 seconds, Current Time: 2022-01-27T12:08:39.635+00:00
CPU:19%, cycle time: 125, GYRO rate: 8000, RX rate: 15, System rate: 9
Voltage: 0 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 5
SD card: Manufacturer 0x3, 31166976kB, 05/2005, v8.0, 'SC32G'
Filesystem: Ready
Arming disable flags: RXLOSS CLI MSP RPMFILTER
# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10       8       0    0.0%    0.0%       179     24 442378       0
01 - (         SYSTEM)    999      14       2    1.3%    0.1%     40666   2211 44173616       2
02 - (           GYRO)   7987      12       1    9.5%    0.7%    354667      0 353522934       0
03 - (         FILTER)   7987      17       6   13.5%    4.7%   1894325      0 353522934       0
04 - (            PID)   7987      32      19   25.5%   15.1%   1965891      0 353522934       0
05 - (            ACC)    999      14       1    1.3%    0.0%     53718   2237 44172550       1
06 - (       ATTITUDE)    100      13       4    0.1%    0.0%     15309    332 4419034       4
07 - (             RX)     15      26       1    0.0%    0.0%     17137    271 2653088       7
08 - (         SERIAL)    100  694450       1 6944.5%    0.0%     71453      0 3995758       1
09 - (       DISPATCH)    999      12       0    1.1%    0.0%     16319   1956 44174245       1
10 - (BATTERY_VOLTAGE)     50      10       2    0.0%    0.0%      2639     96 2209521       1
11 - (BATTERY_CURRENT)     50       9       0    0.0%    0.0%       903     73 2209521       1
12 - ( BATTERY_ALERTS)      5       8       2    0.0%    0.0%       180      8 221196       2
13 - (         BEEPER)    100      11       0    0.1%    0.0%      2696    889 3995757       1
16 - (           BARO)     40      38      38    0.1%    0.1%     29170      5 5110261       6
17 - (       ALTITUDE)     40      15       4    0.0%    0.0%      6220    279 1700162       3
19 - (      TELEMETRY)    250      14       1    0.3%    0.0%     14352    731 10523736       1
21 - (            OSD)     12      45      14    0.0%    0.0%     80229     50 10444070      46
23 - (            CMS)     20       9       1    0.0%    0.0%       644    103 852097       1
24 - (        VTXCTRL)      5       7       2    0.0%    0.0%       113     13 220717       2
25 - (        CAMCTRL)      5      10       1    0.0%    0.0%        94     10 220732       1
27 - (    ADCINTERNAL)      1       5       1    0.0%    0.0%        44      2  44221       1
RX Check Function                  18       3                     39915
Total (excluding SERIAL)                                20.7%

@hydra
Copy link
Contributor Author

hydra commented Jan 28, 2022

Still happening here, on a more recent build with some other PR's merged and fixed.

# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10      43       7    0.0%    0.0%       686    205 308503       7
01 - (         SYSTEM)    994      64       5    6.3%    0.4%     31708   9481 30221790       5
02 - (           GYRO)   8013      80      15   64.1%   12.0%   2543754      0 244951898       0
03 - (         FILTER)   8013      77      15   61.7%   12.0%   1471988      0 244951898       0
04 - (            PID)   8013     172      47  137.8%   37.6%    196484      0 244951898       0
05 - (            ACC)    988     103      28   10.1%    2.7%    680750  30352 29555010      28
06 - (       ATTITUDE)    100      99      36    0.9%    0.3%    102726   3404 3042775      36
07 - (             RX)     15     125      10    0.1%    0.0%    648041 286391 28808644      57
08 - (         SERIAL)     98     271      76    2.6%    0.7%    134250      0 2610117      76
09 - (       DISPATCH)    995      44       3    4.3%    0.2%      8403  50674 30310213       3
10 - (BATTERY_VOLTAGE)     50      55      11    0.2%    0.0%      7434    830 1539396      11
11 - (BATTERY_CURRENT)     50      43       3    0.2%    0.0%      1524   1563 1539354       3
12 - ( BATTERY_ALERTS)      5      48      32    0.0%    0.0%      1068     95 154244      32
13 - (         BEEPER)     98      54       4    0.5%    0.0%      7080   5888 2610116       4
16 - (           BARO)     25     105      97    0.2%    0.2%     55909 831090 2323771      23
17 - (       ALTITUDE)     30     104      32    0.3%    0.0%     35530 160307 902060      32
19 - (      TELEMETRY)    196      65       1    1.2%    0.0%     47132  88665 5659644       1
20 - (       LEDSTRIP)     75     115      17    0.8%    0.1%     23516 193623 2224894      17
21 - (            OSD)     42     924      18    3.8%    0.0%   1465534 9797861 579862266       4
22 - (PIXEL_OSD_VIDEO)     60      45       5    0.2%    0.0%       687   2654 1852754       5
24 - (            CMS)     12      51      28    0.0%    0.0%      1773    331 384061      28
25 - (        VTXCTRL)      5      53       6    0.0%    0.0%       497     71 140928       6
26 - (        CAMCTRL)      5      38       2    0.0%    0.0%       115     90 141576       2
28 - (    ADCINTERNAL)      1      47      20    0.0%    0.0%       197     22  30346      20
29 - (       PINIOBOX)     12      53      17    0.0%    0.0%      2550    415 383790      17
RX Check Function                3128       1                   2998289
Total (excluding SERIAL)                                65.5%

# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10       1       6    0.0%    0.0%       687      0     78       6
01 - (         SYSTEM)    989       9       5    0.8%    0.4%     31713      3   7812       5
02 - (           GYRO)   8006      23      16   18.4%   12.8%   2544381      0  62916       0
03 - (         FILTER)   8006      23      14   18.4%   11.2%   1472319      0  62916       0
04 - (            PID)   8006      60      46   48.0%   36.8%    197420      0  62916       0
05 - (            ACC)    987      39      32    3.8%    3.1%    680917      4   7764      32
06 - (       ATTITUDE)    100      38      34    0.3%    0.3%    102739      1    784      34
07 - (             RX)     15      59       9    0.0%    0.0%    648055      1    472      55
08 - (         SERIAL)    100   16023      42  160.2%    0.4%    134270      0    783      42
09 - (       DISPATCH)    993       6       3    0.5%    0.2%      8405      9   7816       3
10 - (BATTERY_VOLTAGE)     50       7       8    0.0%    0.0%      7435      0    393       8
11 - (BATTERY_CURRENT)     50       4       1    0.0%    0.0%      1524      0    393       1
12 - ( BATTERY_ALERTS)      5       9      32    0.0%    0.0%      1068      0     40      32
13 - (         BEEPER)    100       9       6    0.0%    0.0%      7081      0    783       6
16 - (           BARO)     25      44      97    0.1%    0.2%     55921    196    582      23
17 - (       ALTITUDE)     30      42      38    0.1%    0.1%     35535     30    238      38
19 - (      TELEMETRY)    194       4       2    0.0%    0.0%     47132      1   1511       2
20 - (       LEDSTRIP)     74      42      28    0.3%    0.2%     23520     47    584      19
21 - (            OSD)     42       0      18    0.0%    0.0%   1465534      0      0       4
22 - (PIXEL_OSD_VIDEO)     60       5       2    0.0%    0.0%       687      0    472       2
24 - (            CMS)     12      17      27    0.0%    0.0%      1774      0     97      27
25 - (        VTXCTRL)      5       6       6    0.0%    0.0%       497      0     39       6
26 - (        CAMCTRL)      5       1       2    0.0%    0.0%       115      0     39       2
28 - (    ADCINTERNAL)      1       8      20    0.0%    0.0%       197      0      7      20
29 - (       PINIOBOX)     12      19      18    0.0%    0.0%      2551      0     97      18
RX Check Function                   6       1                   2998290
Total (excluding SERIAL)                                65.3%


# status
MCU H730 Clock=520MHz, Vref=3.28V, Core temp=49degC
Stack size: 2048, Stack address: 0x2000f000
Configuration: CONFIGURED, size: 3944, max available: 65536
Devices detected: SPI:1, I2C:1
Gyros detected: gyro 1 locked
GYRO=ICM42605, ACC=ICM42605, BARO=BMP388
OSD: SPRACINGOSD
System Uptime: 30901 seconds, Current Time: 2022-01-28T00:05:35.925+00:00
CPU:27%, cycle time: 125, GYRO rate: 8000, RX rate: 1052, System rate: 9
Voltage: 74 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 3
SD card: Manufacturer 0x27, 30289920kB, 01/2005, v6.0, 'SD32G'
Filesystem: Ready
Arming disable flags: RXLOSS CLI MSP

# version
# Betaflight / SPRACINGH7RF (SP7R) 4.3.0 Jan 27 2022 / 16:22:01 (7bc3b34810) MSP API: 1.44

https://github.com/spracing/betaflight/commits/bf-h7rf-wip-6

@hydra
Copy link
Contributor Author

hydra commented Jan 28, 2022

Not happening here after 44147 seconds. OSD still updating. Can you reproduce this on another FC with a non-PixelOSD display?

I've not tried, but as noted via slack it should be possible to hard-code the osd states, osd group memberships and element durations in a test build or unit test for the scheduler to provoke the same behavior.

I'm testing to see if the changes in #11354 help.

@hydra
Copy link
Contributor Author

hydra commented Jan 30, 2022

I've been running the PR from #11354 and so far so good.

# status
MCU H730 Clock=520MHz, Vref=3.28V, Core temp=48degC
Stack size: 2048, Stack address: 0x2000f000
Configuration: CONFIGURED, size: 3944, max available: 65536
Devices detected: SPI:1, I2C:1
Gyros detected: gyro 1 locked
GYRO=ICM42605, ACC=ICM42605, BARO=BMP388
OSD: SPRACINGOSD
System Uptime: 73160 seconds, Current Time: 2022-01-30T12:49:43.098+00:00
CPU:41%, cycle time: 128, GYRO rate: 7812, RX rate: 15, System rate: 9
Voltage: 71 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 3
SD card: Manufacturer 0x27, 30289920kB, 01/2005, v6.0, 'SD32G'
Filesystem: Ready
Arming disable flags: RXLOSS CLI MSP

# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10      25       3    0.0%    0.0%       343   1109 731435       3
01 - (         SYSTEM)    985      62       7    6.1%    0.6%    185540  33249 72066052       7
02 - (           GYRO)   8019      56      18   44.9%   14.4%   2045908      0 577946580       0
03 - (         FILTER)   8019      61      13   48.9%   10.4%   3313299      0 577946580       0
04 - (            PID)   8019     165      52  132.3%   41.6%   2031216      0 577946580       0
05 - (            ACC)    981      68      27    6.6%    2.6%   1343130  37475 71443373      27
06 - (       ATTITUDE)     97      85      73    0.8%    0.7%    160474   5784 7275453      73
07 - (             RX)     15     123      14    0.1%    0.0%    177132  22945 4376436      66
08 - (         SERIAL)     99     636      60    6.2%    0.5%     43477      0 7278095      60
09 - (       DISPATCH)    985      34       7    3.3%    0.6%     23945 150159 72137829       7
10 - (BATTERY_VOLTAGE)     50      40      11    0.2%    0.0%     13941   1946 3652679      11
11 - (BATTERY_CURRENT)     50      34       3    0.1%    0.0%      1999   3460 3653273       3
12 - ( BATTERY_ALERTS)      5      31      13    0.0%    0.0%      1922    221 365758      13
13 - (         BEEPER)     99      37       8    0.3%    0.0%     10669   5211 7278019       8
16 - (           BARO)     28      93      92    0.2%    0.2%    130942     62 6234553      92
17 - (       ALTITUDE)     40      85      46    0.3%    0.1%    107003   2566 2913413      46
19 - (      TELEMETRY)    249      46       6    1.1%    0.1%    100317  15839 18148999       6
20 - (       LEDSTRIP)     99      97      55    0.9%    0.5%     51916   2219 7224929      55
21 - (            OSD)     60    1016      26    6.0%    0.1%   3400485 42711267 326803326       2
22 - (PIXEL_OSD_VIDEO)     60      23       2    0.1%    0.0%      1596   9604 4390070       2
24 - (            CMS)     20      31       8    0.0%    0.0%      6027   1395 1457170       8
25 - (        VTXCTRL)      5      19       7    0.0%    0.0%      1262    396 364630       7
26 - (        CAMCTRL)      5      26       4    0.0%    0.0%       702    463 364644       4
28 - (    ADCINTERNAL)      1      32       8    0.0%    0.0%       300     81  72946       8
29 - (       PINIOBOX)     20      32      11    0.0%    0.0%      5275   1403 1457039      11
RX Check Function                2670       4                   1215691
Total (excluding SERIAL)                                71.9%

# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10       5       4    0.0%    0.0%       343      1    116       4
01 - (         SYSTEM)    988      17       7    1.6%    0.6%    185570      3  11412       7
02 - (           GYRO)   8000      31      18   24.8%   14.4%   2046915      0  91802       0
03 - (         FILTER)   8000      23      13   18.4%   10.4%   3313812      0  91802       0
04 - (            PID)   8000      61      48   48.8%   38.4%   2032918      0  91802       0
05 - (            ACC)    981      37      26    3.6%    2.5%   1343344      6  11314      26
06 - (       ATTITUDE)     99      43      50    0.4%    0.4%    160500      0   1150      50
07 - (             RX)     15      80      11    0.1%    0.0%    177159      3    692      67
08 - (         SERIAL)     99   19525      57  193.2%    0.5%     43503      0   1148      57
09 - (       DISPATCH)    988      10       2    0.9%    0.1%     23948     16  11422       2
10 - (BATTERY_VOLTAGE)     50      13      10    0.0%    0.0%     13944      1    579      10
11 - (BATTERY_CURRENT)     50       7       6    0.0%    0.0%      2000      0    579       6
12 - ( BATTERY_ALERTS)      5      17      16    0.0%    0.0%      1923      1     58      16
13 - (         BEEPER)     99      10       7    0.0%    0.0%     10670      2   1148       7
16 - (           BARO)     27      56      92    0.1%    0.2%    130962      0    954      92
17 - (       ALTITUDE)     40      51      50    0.2%    0.2%    107018      1    462      50
19 - (      TELEMETRY)    248       5       2    0.1%    0.0%    100317      3   2882       2
20 - (       LEDSTRIP)     99      56      55    0.5%    0.5%     51924      0   1147      55
21 - (            OSD)     60     178       6    1.0%    0.0%   3401717   7060 722456       6
22 - (PIXEL_OSD_VIDEO)     60       6       5    0.0%    0.0%      1596      2    696       5
24 - (            CMS)     20      10       8    0.0%    0.0%      6028      0    231       8
25 - (        VTXCTRL)      5       5       7    0.0%    0.0%      1262      0     58       7
26 - (        CAMCTRL)      5       5       4    0.0%    0.0%       702      0     58       4
28 - (    ADCINTERNAL)      1       5       8    0.0%    0.0%       300      0     12       8
29 - (       PINIOBOX)     20       9      10    0.0%    0.0%      5275      0    231      10
RX Check Function                2621       4                   1218579
Total (excluding SERIAL)                                67.7%

@github-actions
Copy link

github-actions bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within a week.

@github-actions github-actions bot added the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label Mar 2, 2022
@hydra
Copy link
Contributor Author

hydra commented Mar 2, 2022

Requires re-verification in light of recent changes to master. My branch where this seen has other latency issues caused by merges of recent PR's into master which are blocking re-verification.

As I understand it the root cause has not been found yet and since the code is isolated to the OSD task and that code hasn't changed the finger is being pointed at recent scheduler changes as being the cause.

@github-actions github-actions bot removed the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label Mar 3, 2022
@hydra
Copy link
Contributor Author

hydra commented Mar 7, 2022

Blocked, waiting on the CPU load issue identified on #11380 (comment) to be fixed first before retesting.

@hydra
Copy link
Contributor Author

hydra commented Mar 23, 2022

This re-occurred today while testing #11483

image

Binaries and code for testing here: #11483 (comment)

USB still functional. Note the uptime below (4533 seconds = 75 minutes) vs the current OSD output On Time value (45 minutes).

# status
MCU H730 Clock=520MHz, Vref=3.27V, Core temp=59degC
Stack size: 2048, Stack address: 0x2000f000
Configuration: CONFIGURED, size: 3946, max available: 65536
Devices detected: SPI:1, I2C:1
Gyros detected: gyro 1 locked
GYRO=ICM42605, ACC=ICM42605, BARO=BMP388
OSD: SPRACINGOSD
System Uptime: 4533 seconds, Current Time: 2022-03-23T14:29:09.935+00:00
CPU:50%, cycle time: 117, GYRO rate: 8547, RX rate: 500, System rate: 9
Voltage: 81 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 3
SD card: Manufacturer 0x27, 30289920kB, 01/2005, v6.0, 'SD32G'
Filesystem: Ready
Arming disable flags: CLI MSP RPMFILTER

# version
# Betaflight / SPRACINGH7RF (SP7R) 4.3.0 Mar 23 2022 / 13:57:17 (7cb20d035a) MSP API: 1.44

@hydra
Copy link
Contributor Author

hydra commented Mar 23, 2022

And even more weirdly, without rebooting the FC the OSD magically resumed by itself:

image

# version
# Betaflight / SPRACINGH7RF (SP7R) 4.3.0 Mar 23 2022 / 13:57:17 (7cb20d035a) MSP API: 1.44

# status
MCU H730 Clock=520MHz, Vref=3.27V, Core temp=60degC
Stack size: 2048, Stack address: 0x2000f000
Configuration: CONFIGURED, size: 3946, max available: 65536
Devices detected: SPI:1, I2C:1
Gyros detected: gyro 1 locked
GYRO=ICM42605, ACC=ICM42605, BARO=BMP388
OSD: SPRACINGOSD
System Uptime: 5008 seconds, Current Time: 2022-03-23T14:37:05.766+00:00
CPU:54%, cycle time: 126, GYRO rate: 7936, RX rate: 502, System rate: 9
Voltage: 82 * 0.01V (0S battery - NOT PRESENT)
I2C Errors: 3
SD card: Manufacturer 0x27, 30289920kB, 01/2005, v6.0, 'SD32G'
Filesystem: Ready
Arming disable flags: CLI MSP RPMFILTER

image

Further indicating a scheduling / roll-over bug in the scheduler. Likely as some comparison resulted in true again after some overflow occurred.

@haslinghuis
Copy link
Member

Please test with latest firmware on non-PixelOSD display hardware

@hydra hydra mentioned this issue Apr 5, 2022
@hydra
Copy link
Contributor Author

hydra commented Apr 6, 2022

For those following, the fix is likely this:

#11459 (review)

For now I'm testing on hardware which is more likely to be affected by the issue that was fixed due to the clock-speed of the system, which on the H7RF is 520Mhz.

@hydra
Copy link
Contributor Author

hydra commented Apr 7, 2022

The bug in the scheduler, (fixed via #11459 (review)) appears to be the cause, I've not seen the task stall after about 12 hours of monitoring today. I will continue to monitor it and will update this issue after I'm satisfied it's no-longer occurring.

@hydra
Copy link
Contributor Author

hydra commented Apr 7, 2022

Typical! Just a few minutes after writing my last comment I see it's stalled again... Still something causing the issue!

Uptime: 11301
Frozen 'On Min' value on OSD: 31:57.42

While it's frozen you can see the 'run' count for the OSD is '0'.

# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10      12       2    0.0%    0.0%       358      0    140      29
01 - (         SYSTEM)    963      46       5    4.4%    0.5%     56802     16  13488      41
02 - (           GYRO)   8019      54       8   43.3%    7.1%    852441      0 112342       0
03 - (         FILTER)   8019      73       7   58.5%    6.0%    715687      0 112342       0
04 - (            PID)   8019     137      30  109.8%   24.2%   2514213      0 112342       0
05 - (            ACC)    844      58      13    4.8%    1.1%    128914     20  13199      55
06 - (       ATTITUDE)     90     121      27    1.0%    0.2%     30688     11   1318      72
07 - (             RX)    499      63      36    3.1%    1.8%    282870     44  20718      57
08 - (         SERIAL)    100   16539       6  165.3%    0.0%     53025      0   1393      40
09 - (       DISPATCH)    988      26       0    2.5%    0.0%      3793     32  13836       9
10 - (BATTERY_VOLTAGE)     50      41       5    0.2%    0.0%      4872      1    700      39
11 - (BATTERY_CURRENT)     50      11       2    0.0%    0.0%      1009      0    700      23
12 - ( BATTERY_ALERTS)      5       7       3    0.0%    0.0%       356      0     70      32
13 - (         BEEPER)    100      36       3    0.3%    0.0%      3700      1   1393      27
16 - (           BARO)     25     116      19    0.2%    0.0%     19419      6    982      66
17 - (       ALTITUDE)     39     134      39    0.5%    0.1%     18314     18    528      67
19 - (      TELEMETRY)    250      11       0    0.2%    0.0%     18623     14   3495       0
20 - (       LEDSTRIP)     79     111      16    0.8%    0.1%     10431      2   1371      75
21 - (            OSD)     18       0       1    0.0%    0.0%    662483      0      0       8
22 - (PIXEL_OSD_VIDEO)     60       4       0    0.0%    0.0%       259      0    843      10
24 - (            CMS)     20      22       6    0.0%    0.0%      1283      0    280      36
25 - (        VTXCTRL)      5       7       3    0.0%    0.0%       161      0     70      26
26 - (        CAMCTRL)      5       6       1    0.0%    0.0%       177      0     70      31
28 - (    ADCINTERNAL)      1       8       4    0.0%    0.0%        77      0     14      27
29 - (       PINIOBOX)     20      31       5    0.0%    0.0%      1151      0    280      55
RX Check Function                  14       0                    573993
Total (excluding SERIAL)                                41.1%

@SteveCEvans Can you remind me what you suggested checking in the debugger again? I can't find your comment.
EDIT: found it: "stick a breakpoint in the scheduler and then when your OSD hangs enable the breakpoint and examine tasks[TASK_OSD]->anticipatedExecutionTime >> TASK_EXEC_TIME_SHIFT"

@hydra
Copy link
Contributor Author

hydra commented Apr 7, 2022

The uptime is now: 14456 and the task has resumed by itself again.

@SteveCEvans Given that the scheduler eventually resumes the osd task, under what conditions would you expect a high anticipatedExecutionTime, which will never change due to the task not running, to cause the scheduler to resume the task?

@github-actions
Copy link

github-actions bot commented May 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within a week.

@github-actions github-actions bot added the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label May 8, 2022
@hydra
Copy link
Contributor Author

hydra commented May 8, 2022

This is still happening.

@github-actions github-actions bot removed the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label May 9, 2022
@github-actions
Copy link

github-actions bot commented Jun 9, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within a week.

@github-actions github-actions bot added the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label Jun 9, 2022
@hydra
Copy link
Contributor Author

hydra commented Jun 15, 2022

Still occuring on a branch based of 4.3.0.

# version
# Betaflight / SPRACINGH7RF (SP7R) 4.3.0 Jun 15 2022 / 13:44:01 (b95fa0f8bb) MSP API: 1.44

# status
MCU H730 Clock=520MHz, Vref=3.27V, Core temp=58degC
...
System Uptime: 4568 seconds, Current Time: 2022-06-15T13:02:28.916+00:00
# tasks
Task list             rate/hz  max/us  avg/us maxload avgload  total/ms   late    run reqd/us
00 - (         SYSTEM)     10       1       0    0.0%    0.0%        43      0     42       3
01 - (         SYSTEM)    989      35       3    3.4%    0.2%     12139      8   4110      30
02 - (           GYRO)   8013      47       9   37.6%    7.7%    343071      0  33480       0
03 - (         FILTER)   8013      51       6   40.8%    4.8%    239136      0  33480       0
04 - (            PID)   8013     114      17   91.3%   14.3%    801984      0  33480       0
05 - (            ACC)    969      51      12    4.9%    1.1%     54658      5   4019      50
06 - (       ATTITUDE)     99     114      31    1.1%    0.3%     16051      4    408      55
07 - (             RX)    500      70      36    3.5%    1.8%    212561     10   6273      52
08 - (         SERIAL)    100   18504       5  185.0%    0.0%     23877      0    417      41
09 - (       DISPATCH)    983      38       1    3.7%    0.1%      7233      7   4106      28
10 - (BATTERY_VOLTAGE)     50      11       4    0.0%    0.0%      1128      0    209      43
11 - (BATTERY_CURRENT)     50       9       1    0.0%    0.0%      1446      1    209       7
12 - ( BATTERY_ALERTS)      5      42       5    0.0%    0.0%       147      1     21      41
13 - (         BEEPER)    100      17       4    0.1%    0.0%      1431      0    417      22
16 - (           BARO)     24      77      24    0.1%    0.0%      8085      1    288     177
17 - (       ALTITUDE)     38     120      55    0.4%    0.2%      8385      5    160      67
19 - (      TELEMETRY)    249       5       0    0.1%    0.0%     17150      7   1043       3
20 - (       LEDSTRIP)     99      60       8    0.5%    0.0%      4398      0    410      71
21 - (            OSD)     50       0       0    0.0%    0.0%    132841      0      0       7
22 - (PIXEL_OSD_VIDEO)     60       2       0    0.0%    0.0%       105      2    251       1
24 - (            CMS)     20      18       4    0.0%    0.0%       398      2     84      17
25 - (        VTXCTRL)      5       7       2    0.0%    0.0%      1252      0     21      66
26 - (        CAMCTRL)      5       1       1    0.0%    0.0%        20      0     21       4
28 - (    ADCINTERNAL)      1       5       4    0.0%    0.0%        30      0      4      46
29 - (       PINIOBOX)     20      43       7    0.0%    0.0%       616      1     84      42
RX Check Function                  14       1                     88818
Total (excluding SERIAL)                                30.5%

@SteveCEvans Note that the OSD task max/us and avg/us and run all being 0 indicate that the scheduler is not scheduling the OSD task and is starving it.

@github-actions github-actions bot removed the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label Jun 16, 2022
@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within a week.

@github-actions github-actions bot added the Inactive Automatically detected and labeled, will be closed after another week of inactivity. label Jul 17, 2022
@hydra
Copy link
Contributor Author

hydra commented Jul 17, 2022

@haslinghuis can you mark this as BUG so the bot doesn't keep marking it as stale?

@haslinghuis haslinghuis added BUG Bugs are excluded from automatically being marked as stale and removed Inactive Automatically detected and labeled, will be closed after another week of inactivity. labels Jul 17, 2022
@SteveCEvans
Copy link
Member

@hydra What state is the OSD task in when it hangs, and what are it’s predicted state durations?

@hydra
Copy link
Contributor Author

hydra commented Jul 19, 2022

The state is always the 'idle' state when it hangs, this one:

https://github.com/betaflight/betaflight/blob/master/src/main/osd/osd.c#L1156-L1176

It never actually runs the osd main loop.

https://github.com/betaflight/betaflight/blob/master/src/main/osd/osd.c#L1179

And at some point the scheduler decides to schedule the task again and it unfreezes the osd task until it happens again.

I'll hook a debugger up now and see if I can catch it and get the predicted state durations. Since it does it infrequently it might take a while to get some numbers for you, but you should be able to assume either, a) very small, b) normal, c) very big and then perhaps look into adjusting the scheduler so it doesn't block the task when either a,b or c occurs. finding out /why/ a,b or c occurs is a secondary problem.

@hydra
Copy link
Contributor Author

hydra commented Jul 19, 2022

Here's a dump of the scheduler's view of the OSD task, the OSD task state is, as always, IDLE, as confirmed by the code-path taken on the disassembly below.

Screenshot 2022-07-19 115600

The scheduler is only running the task check function, and not the task itself. So I can't see the state durations as they're static and the debugger can't resolve the symbols unless the method is called, I'll have to move them outside the method, recompile and catch it again...
Screenshot 2022-07-19 115828

Here you can see the disassembly and registers of the cmpTimeUs
Screenshot 2022-07-19 115940

hydra added a commit to spracing/betaflight that referenced this issue Jul 19, 2022
 Please enter the commit message for your changes. Lines starting
@hydra
Copy link
Contributor Author

hydra commented Jul 19, 2022

I made a branch based off 4.3.1-maintenance for the H7RF and included the temporary commit to expose the OSD state outside of the osdUpdate method, it's here: https://github.com/spracing/betaflight/commits/bf-h7rf-wip-25

@hydra
Copy link
Contributor Author

hydra commented Jul 19, 2022

@SteveCEvans here you go:

Screenshot 2022-07-19 223450
Screenshot 2022-07-19 223502
Screenshot 2022-07-19 223518
Screenshot 2022-07-19 223536

@hydra
Copy link
Contributor Author

hydra commented Jul 19, 2022

and here's the full tasks, as viewed by the scheduler:

Name : tasks
	Details:
{
{attribute = 0x20001604 <task_attributes>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 100079, lastExecutedAtUs = 1191555727, lastSignaledAtUs = 0, lastDesiredAt = 1186098112, movingAverageCycleTimeUs = 100062.438, anticipatedExecutionTime = 554, movingSumDeltaTime10thUs = 8005083, movingSumExecutionTime10thUs = 21, maxExecutionTimeUs = 36, totalExecutionTimeUs = 113556, lastStatsAtUs = 1191555727, runCount = 140710, lateCount = 144, execTime = 4}, 
{attribute = 0x2000161c <task_attributes+24>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1002, lastExecutedAtUs = 1191560666, lastSignaledAtUs = 0, lastDesiredAt = 1019121112, movingAverageCycleTimeUs = 1003.77277, anticipatedExecutionTime = 576, movingSumDeltaTime10thUs = 80394, movingSumExecutionTime10thUs = 67, maxExecutionTimeUs = 67, totalExecutionTimeUs = 22147269, lastStatsAtUs = 1191560666, runCount = 13904023, lateCount = 9584, execTime = 4}, 
{attribute = 0x20001634 <task_attributes+48>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 54, lastExecutedAtUs = 1191561272, lastSignaledAtUs = 0, lastDesiredAt = 1022013987, movingAverageCycleTimeUs = 125.654915, anticipatedExecutionTime = 3582, movingSumDeltaTime10thUs = 10077, movingSumExecutionTime10thUs = 873, maxExecutionTimeUs = 68, totalExecutionTimeUs = 977758094, lastStatsAtUs = 1191561272, runCount = 111255327, lateCount = 0, execTime = 0}, 
{attribute = 0x2000164c <task_attributes+72>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 54, lastExecutedAtUs = 1191561272, lastSignaledAtUs = 0, lastDesiredAt = 1022013987, movingAverageCycleTimeUs = 125.654915, anticipatedExecutionTime = 3070, movingSumDeltaTime10thUs = 10077, movingSumExecutionTime10thUs = 709, maxExecutionTimeUs = 60, totalExecutionTimeUs = 518090053, lastStatsAtUs = 1191561272, runCount = 111255327, lateCount = 0, execTime = 0}, 
{attribute = 0x20001664 <task_attributes+96>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 54, lastExecutedAtUs = 1191561272, lastSignaledAtUs = 0, lastDesiredAt = 1022013987, movingAverageCycleTimeUs = 125.654915, anticipatedExecutionTime = 15742, movingSumDeltaTime10thUs = 10077, movingSumExecutionTime10thUs = 2148, maxExecutionTimeUs = 158, totalExecutionTimeUs = 1861389497, lastStatsAtUs = 1191561272, runCount = 111255327, lateCount = 0, execTime = 0}, 
{attribute = 0x2000167c <task_attributes+120>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1066, lastExecutedAtUs = 1191560548, lastSignaledAtUs = 0, lastDesiredAt = 935812112, movingAverageCycleTimeUs = 1005.90723, anticipatedExecutionTime = 1768, movingSumDeltaTime10thUs = 80821, movingSumExecutionTime10thUs = 710, maxExecutionTimeUs = 80, totalExecutionTimeUs = 135574770, lastStatsAtUs = 1191560548, runCount = 13820714, lateCount = 8540, execTime = 13}, 
{attribute = 0x20001694 <task_attributes+144>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10000, lastExecutedAtUs = 1191558926, lastSignaledAtUs = 0, lastDesiredAt = 1125918112, movingAverageCycleTimeUs = 10015.1543, anticipatedExecutionTime = 4829, movingSumDeltaTime10thUs = 801264, movingSumExecutionTime10thUs = 1190, maxExecutionTimeUs = 88, totalExecutionTimeUs = 31686898, lastStatsAtUs = 1191558926, runCount = 1401082, lateCount = 994, execTime = 37}, 
{attribute = 0x200016ac <task_attributes+168>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 100176, lastExecutedAtUs = 1191555670, lastSignaledAtUs = 1191555610, lastDesiredAt = 4198052620, movingAverageCycleTimeUs = 31669.4492, anticipatedExecutionTime = 7808, movingSumDeltaTime10thUs = 8005689, movingSumExecutionTime10thUs = 4099, maxExecutionTimeUs = 136, totalExecutionTimeUs = 26440652, lastStatsAtUs = 1191555670, runCount = 422004, lateCount = 8959, execTime = 67}, 
{attribute = 0x200016c4 <task_attributes+192>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10064, lastExecutedAtUs = 1191556542, lastSignaledAtUs = 0, lastDesiredAt = 3295585408, movingAverageCycleTimeUs = 10010.0107, anticipatedExecutionTime = 2272, movingSumDeltaTime10thUs = 801024, movingSumExecutionTime10thUs = 183, maxExecutionTimeUs = 256, totalExecutionTimeUs = 43368663, lastStatsAtUs = 1191556542, runCount = 1188552, lateCount = 0, execTime = 17}, 
{attribute = 0x200016dc <task_attributes+216>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1004, lastExecutedAtUs = 1191560542, lastSignaledAtUs = 0, lastDesiredAt = 1039365112, movingAverageCycleTimeUs = 1004.49768, anticipatedExecutionTime = 276, movingSumDeltaTime10thUs = 80553, movingSumExecutionTime10thUs = 27, maxExecutionTimeUs = 40, totalExecutionTimeUs = 5047650, lastStatsAtUs = 1191560542, runCount = 13924267, lateCount = 26093, execTime = 2}, 
{attribute = 0x200016f4 <task_attributes+240>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 20050, lastExecutedAtUs = 1191545550, lastSignaledAtUs = 0, lastDesiredAt = 1175838112, movingAverageCycleTimeUs = 20009.9043, anticipatedExecutionTime = 1524, movingSumDeltaTime10thUs = 1600902, movingSumExecutionTime10thUs = 230, maxExecutionTimeUs = 57, totalExecutionTimeUs = 2358736, lastStatsAtUs = 1191545550, runCount = 703037, lateCount = 652, execTime = 11}, 
{attribute = 0x2000170c <task_attributes+264>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 20000, lastExecutedAtUs = 1191555337, lastSignaledAtUs = 0, lastDesiredAt = 1174378112, movingAverageCycleTimeUs = 20012.0898, anticipatedExecutionTime = 2423, movingSumDeltaTime10thUs = 1600901, movingSumExecutionTime10thUs = 413, maxExecutionTimeUs = 58, totalExecutionTimeUs = 4385094, lastStatsAtUs = 1191555337, runCount = 702964, lateCount = 540, execTime = 18}, 
{attribute = 0x20001724 <task_attributes+288>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 200001, lastExecutedAtUs = 1191369474, lastSignaledAtUs = 0, lastDesiredAt = 1188498112, movingAverageCycleTimeUs = 200003.781, anticipatedExecutionTime = 2133, movingSumDeltaTime10thUs = 16000193, movingSumExecutionTime10thUs = 292, maxExecutionTimeUs = 45, totalExecutionTimeUs = 455971, lastStatsAtUs = 1191369474, runCount = 70367, lateCount = 53, execTime = 16}, 
{attribute = 0x2000173c <task_attributes+312>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10063, lastExecutedAtUs = 1191556554, lastSignaledAtUs = 0, lastDesiredAt = 3295585408, movingAverageCycleTimeUs = 10009.8408, anticipatedExecutionTime = 1054, movingSumDeltaTime10thUs = 801001, movingSumExecutionTime10thUs = 166, maxExecutionTimeUs = 44, totalExecutionTimeUs = 3707848, lastStatsAtUs = 1191556554, runCount = 1188552, lateCount = 979, execTime = 8}, 
{attribute = 0x20001754 <task_attributes+336>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0}, 
{attribute = 0x2000176c <task_attributes+360>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0}, 
{attribute = 0x20001784 <task_attributes+384>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1005, lastExecutedAtUs = 1191542303, lastSignaledAtUs = 0, lastDesiredAt = 4038487821, movingAverageCycleTimeUs = 11904.0713, anticipatedExecutionTime = 12032, movingSumDeltaTime10thUs = 3004163, movingSumExecutionTime10thUs = 1354, maxExecutionTimeUs = 96, totalExecutionTimeUs = 23733371, lastStatsAtUs = 1191538183, runCount = 1192219, lateCount = 166, execTime = 76}, 
{attribute = 0x2000179c <task_attributes+408>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 25093, lastExecutedAtUs = 1191542553, lastSignaledAtUs = 0, lastDesiredAt = 462648112, movingAverageCycleTimeUs = 25033.1738, anticipatedExecutionTime = 5848, movingSumDeltaTime10thUs = 2002845, movingSumExecutionTime10thUs = 2344, maxExecutionTimeUs = 111, totalExecutionTimeUs = 23126576, lastStatsAtUs = 1191542553, runCount = 533902, lateCount = 503, execTime = 45}, 
{attribute = 0x200017b4 <task_attributes+432>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0}, 
{attribute = 0x200017cc <task_attributes+456>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 4003, lastExecutedAtUs = 1191560564, lastSignaledAtUs = 0, lastDesiredAt = 186322112, movingAverageCycleTimeUs = 4011.49438, anticipatedExecutionTime = 495, movingSumDeltaTime10thUs = 321347, movingSumExecutionTime10thUs = 27, maxExecutionTimeUs = 83, totalExecutionTimeUs = 42939876, lastStatsAtUs = 1191560564, runCount = 3267806, lateCount = 9078, execTime = 3}, 
{attribute = 0x200017e4 <task_attributes+480>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10082, lastExecutedAtUs = 1191551792, lastSignaledAtUs = 0, lastDesiredAt = 3294995408, movingAverageCycleTimeUs = 10020.8789, anticipatedExecutionTime = 5753, movingSumDeltaTime10thUs = 801846, movingSumExecutionTime10thUs = 621, maxExecutionTimeUs = 103, totalExecutionTimeUs = 15363591, lastStatsAtUs = 1191551792, runCount = 1188493, lateCount = 24813, execTime = 44}, 
{attribute = 0x200017fc <task_attributes+504>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 20251, lastExecutedAtUs = 4256951772, lastSignaledAtUs = 4256951772, lastDesiredAt = 356512830, movingAverageCycleTimeUs = 9.49466705, anticipatedExecutionTime = 384, movingSumDeltaTime10thUs = 1602307, movingSumExecutionTime10thUs = 49, maxExecutionTimeUs = 798, totalExecutionTimeUs = 1312614187, lastStatsAtUs = 4256932184, runCount = 869271643, lateCount = 6665563, execTime = 4}, 
{attribute = 0x20001814 <task_attributes+528>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 16687, lastExecutedAtUs = 1191555925, lastSignaledAtUs = 1191555925, lastDesiredAt = 1200151354, movingAverageCycleTimeUs = 16661.4707, anticipatedExecutionTime = 412, movingSumDeltaTime10thUs = 1333133, movingSumExecutionTime10thUs = 44, maxExecutionTimeUs = 37, totalExecutionTimeUs = 307376, lastStatsAtUs = 1191555925, runCount = 845137, lateCount = 1898, execTime = 3}, 
{attribute = 0x2000182c <task_attributes+552>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0}, 
{attribute = 0x20001844 <task_attributes+576>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 50087, lastExecutedAtUs = 1191555827, lastSignaledAtUs = 0, lastDesiredAt = 3584115408, movingAverageCycleTimeUs = 50030.7812, anticipatedExecutionTime = 1342, movingSumDeltaTime10thUs = 4002536, movingSumExecutionTime10thUs = 138, maxExecutionTimeUs = 40, totalExecutionTimeUs = 701923, lastStatsAtUs = 1191555827, runCount = 243481, lateCount = 186, execTime = 10}, 
{attribute = 0x2000185c <task_attributes+600>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 200000, lastExecutedAtUs = 1191529561, lastSignaledAtUs = 0, lastDesiredAt = 3302065408, movingAverageCycleTimeUs = 200034.719, anticipatedExecutionTime = 8867, movingSumDeltaTime10thUs = 16002664, movingSumExecutionTime10thUs = 94, maxExecutionTimeUs = 153, totalExecutionTimeUs = 3580090, lastStatsAtUs = 1191529561, runCount = 59460, lateCount = 6793, execTime = 69}, 
{attribute = 0x20001874 <task_attributes+624>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 200013, lastExecutedAtUs = 1191399962, lastSignaledAtUs = 0, lastDesiredAt = 1042498112, movingAverageCycleTimeUs = 200003.859, anticipatedExecutionTime = 544, movingSumDeltaTime10thUs = 16000270, movingSumExecutionTime10thUs = 29, maxExecutionTimeUs = 35, totalExecutionTimeUs = 28450, lastStatsAtUs = 1191399962, runCount = 69637, lateCount = 87, execTime = 4},
{attribute = 0x2000188c <task_attributes+648>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0},
{attribute = 0x200018a4 <task_attributes+672>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1000003, lastExecutedAtUs = 1190946608, lastSignaledAtUs = 0, lastDesiredAt = 1147098112, movingAverageCycleTimeUs = 1000028.06, anticipatedExecutionTime = 1277, movingSumDeltaTime10thUs = 80000821, movingSumExecutionTime10thUs = 418, maxExecutionTimeUs = 43, totalExecutionTimeUs = 86029, lastStatsAtUs = 1190946608, runCount = 14032, lateCount = 8, execTime = 9},
{attribute = 0x200018bc <task_attributes+696>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 50037, lastExecutedAtUs = 1191529446, lastSignaledAtUs = 0, lastDesiredAt = 3491565408, movingAverageCycleTimeUs = 50013.0859, anticipatedExecutionTime = 1136, movingSumDeltaTime10thUs = 4001082, movingSumExecutionTime10thUs = 339, maxExecutionTimeUs = 40, totalExecutionTimeUs = 1188524, lastStatsAtUs = 1191529446, runCount = 241630, lateCount = 178, execTime = 8},
{attribute = 0x200018d4 <task_attributes+720>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0}}```

@hydra
Copy link
Contributor Author

hydra commented Jul 19, 2022

to catch what it looked like when the scheduler finally called osdUpdate a breakpoint was set at first line of osdUpdate, here's the result:

Name : tasks
	Details:
{
{attribute = 0x20001604 <task_attributes>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 100079, lastExecutedAtUs = 2109381362, lastSignaledAtUs = 0, lastDesiredAt = 2103598112, movingAverageCycleTimeUs = 100072.547, anticipatedExecutionTime = 567, movingSumDeltaTime10thUs = 8005622, movingSumExecutionTime10thUs = 32, maxExecutionTimeUs = 36, totalExecutionTimeUs = 116264, lastStatsAtUs = 2109381362, runCount = 149885, lateCount = 178, execTime = 4},
{attribute = 0x2000161c <task_attributes+24>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1003, lastExecutedAtUs = 2109465329, lastSignaledAtUs = 0, lastDesiredAt = 1932798112, movingAverageCycleTimeUs = 1002.77924, anticipatedExecutionTime = 424, movingSumDeltaTime10thUs = 80179, movingSumExecutionTime10thUs = 56, maxExecutionTimeUs = 67, totalExecutionTimeUs = 22826720, lastStatsAtUs = 2109465329, runCount = 14817700, lateCount = 10300, execTime = 3},
{attribute = 0x20001634 <task_attributes+48>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 125, lastExecutedAtUs = 2109465470, lastSignaledAtUs = 0, lastDesiredAt = 1939298737, movingAverageCycleTimeUs = 124.912956, anticipatedExecutionTime = 1580, movingSumDeltaTime10thUs = 9993, movingSumExecutionTime10thUs = 688, maxExecutionTimeUs = 68, totalExecutionTimeUs = 1041140659, lastStatsAtUs = 2109465470, runCount = 118593605, lateCount = 0, execTime = 0},
{attribute = 0x2000164c <task_attributes+72>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 125, lastExecutedAtUs = 2109465470, lastSignaledAtUs = 0, lastDesiredAt = 1939298737, movingAverageCycleTimeUs = 124.912956, anticipatedExecutionTime = 1336, movingSumDeltaTime10thUs = 9993, movingSumExecutionTime10thUs = 336, maxExecutionTimeUs = 60, totalExecutionTimeUs = 551268528, lastStatsAtUs = 2109465470, runCount = 118593605, lateCount = 0, execTime = 0},
{attribute = 0x20001664 <task_attributes+96>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 125, lastExecutedAtUs = 2109465470, lastSignaledAtUs = 0, lastDesiredAt = 1939298737, movingAverageCycleTimeUs = 124.912956, anticipatedExecutionTime = 5087, movingSumDeltaTime10thUs = 9993, movingSumExecutionTime10thUs = 1216, maxExecutionTimeUs = 158, totalExecutionTimeUs = 1979715936, lastStatsAtUs = 2109465470, runCount = 118593605, lateCount = 0, execTime = 0},
{attribute = 0x2000167c <task_attributes+120>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1001, lastExecutedAtUs = 2109465301, lastSignaledAtUs = 0, lastDesiredAt = 1848727112, movingAverageCycleTimeUs = 1004.362, anticipatedExecutionTime = 1758, movingSumDeltaTime10thUs = 80336, movingSumExecutionTime10thUs = 682, maxExecutionTimeUs = 80, totalExecutionTimeUs = 143614062, lastStatsAtUs = 2109465301, runCount = 14733629, lateCount = 8930, execTime = 13},
{attribute = 0x20001694 <task_attributes+144>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10003, lastExecutedAtUs = 2109456668, lastSignaledAtUs = 0, lastDesiredAt = 2042308112, movingAverageCycleTimeUs = 10009.4648, anticipatedExecutionTime = 3659, movingSumDeltaTime10thUs = 800615, movingSumExecutionTime10thUs = 1000, maxExecutionTimeUs = 88, totalExecutionTimeUs = 32904283, lastStatsAtUs = 2109456668, runCount = 1492721, lateCount = 1056, execTime = 28},
{attribute = 0x200016ac <task_attributes+168>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 100179, lastExecutedAtUs = 2109381303, lastSignaledAtUs = 2109381235, lastDesiredAt = 736902672, movingAverageCycleTimeUs = 31671.9453, anticipatedExecutionTime = 6144, movingSumDeltaTime10thUs = 8006460, movingSumExecutionTime10thUs = 4064, maxExecutionTimeUs = 136, totalExecutionTimeUs = 27856620, lastStatsAtUs = 2109381303, runCount = 449520, lateCount = 9042, execTime = 59},
{attribute = 0x200016c4 <task_attributes+192>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10002, lastExecutedAtUs = 2109463430, lastSignaledAtUs = 0, lastDesiredAt = 4212645408, movingAverageCycleTimeUs = 10006.3535, anticipatedExecutionTime = 1127, movingSumDeltaTime10thUs = 800437, movingSumExecutionTime10thUs = 184, maxExecutionTimeUs = 256, totalExecutionTimeUs = 43518627, lastStatsAtUs = 2109463430, runCount = 1280258, lateCount = 0, execTime = 8},
{attribute = 0x200016dc <task_attributes+216>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1000, lastExecutedAtUs = 2109465321, lastSignaledAtUs = 0, lastDesiredAt = 1953182112, movingAverageCycleTimeUs = 1003.13098, anticipatedExecutionTime = 247, movingSumDeltaTime10thUs = 80198, movingSumExecutionTime10thUs = 39, maxExecutionTimeUs = 40, totalExecutionTimeUs = 5373674, lastStatsAtUs = 2109465321, runCount = 14838084, lateCount = 27288, execTime = 1},
{attribute = 0x200016f4 <task_attributes+240>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 20001, lastExecutedAtUs = 2109461519, lastSignaledAtUs = 0, lastDesiredAt = 2093138112, movingAverageCycleTimeUs = 20014.8418, anticipatedExecutionTime = 1533, movingSumDeltaTime10thUs = 1601382, movingSumExecutionTime10thUs = 254, maxExecutionTimeUs = 57, totalExecutionTimeUs = 2508208, lastStatsAtUs = 2109461519, runCount = 748902, lateCount = 691, execTime = 11},
{attribute = 0x2000170c <task_attributes+264>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 20000, lastExecutedAtUs = 2109465060, lastSignaledAtUs = 0, lastDesiredAt = 2091658112, movingAverageCycleTimeUs = 20009.8223, anticipatedExecutionTime = 1441, movingSumDeltaTime10thUs = 1600681, movingSumExecutionTime10thUs = 387, maxExecutionTimeUs = 58, totalExecutionTimeUs = 4614869, lastStatsAtUs = 2109465060, runCount = 748828, lateCount = 577, execTime = 11},
{attribute = 0x20001724 <task_attributes+288>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 200005, lastExecutedAtUs = 2109426146, lastSignaledAtUs = 0, lastDesiredAt = 2106498112, movingAverageCycleTimeUs = 200009.75, anticipatedExecutionTime = 916, movingSumDeltaTime10thUs = 16000753, movingSumExecutionTime10thUs = 311, maxExecutionTimeUs = 45, totalExecutionTimeUs = 473274, lastStatsAtUs = 2109426146, runCount = 74957, lateCount = 54, execTime = 7},
{attribute = 0x2000173c <task_attributes+312>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10000, lastExecutedAtUs = 2109455781, lastSignaledAtUs = 0, lastDesiredAt = 4212555408, movingAverageCycleTimeUs = 10006.5518, anticipatedExecutionTime = 858, movingSumDeltaTime10thUs = 800518, movingSumExecutionTime10thUs = 190, maxExecutionTimeUs = 44, totalExecutionTimeUs = 3923018, lastStatsAtUs = 2109455781, runCount = 1280249, lateCount = 1048, execTime = 6},
{attribute = 0x20001754 <task_attributes+336>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0},
{attribute = 0x2000176c <task_attributes+360>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0},
{attribute = 0x20001784 <task_attributes+384>, dynamicPriority = 3, taskAgePeriods = 1, taskLatestDeltaTimeUs = 1006, lastExecutedAtUs = 2109397291, lastSignaledAtUs = 0, lastDesiredAt = 223428502, movingAverageCycleTimeUs = 11892.7969, anticipatedExecutionTime = 12032, movingSumDeltaTime10thUs = 3001511, movingSumExecutionTime10thUs = 1561, maxExecutionTimeUs = 96, totalExecutionTimeUs = 24983851, lastStatsAtUs = 2109393174, runCount = 1260877, lateCount = 166, execTime = 94},
{attribute = 0x2000179c <task_attributes+408>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 25081, lastExecutedAtUs = 2109458262, lastSignaledAtUs = 0, lastDesiredAt = 1379223112, movingAverageCycleTimeUs = 25031.0605, anticipatedExecutionTime = 5651, movingSumDeltaTime10thUs = 2002592, movingSumExecutionTime10thUs = 2016, maxExecutionTimeUs = 111, totalExecutionTimeUs = 24133246, lastStatsAtUs = 2109458262, runCount = 570565, lateCount = 524, execTime = 44},
{attribute = 0x200017b4 <task_attributes+432>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0},
{attribute = 0x200017cc <task_attributes+456>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 4002, lastExecutedAtUs = 2109463328, lastSignaledAtUs = 0, lastDesiredAt = 1102746112, movingAverageCycleTimeUs = 4003.08203, anticipatedExecutionTime = 622, movingSumDeltaTime10thUs = 320190, movingSumExecutionTime10thUs = 39, maxExecutionTimeUs = 83, totalExecutionTimeUs = 43017190, lastStatsAtUs = 2109463328, runCount = 3496912, lateCount = 9481, execTime = 4},
{attribute = 0x200017e4 <task_attributes+480>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 10001, lastExecutedAtUs = 2109455647, lastSignaledAtUs = 0, lastDesiredAt = 4209525408, movingAverageCycleTimeUs = 10015.8438, anticipatedExecutionTime = 5621, movingSumDeltaTime10thUs = 801217, movingSumExecutionTime10thUs = 1027, maxExecutionTimeUs = 103, totalExecutionTimeUs = 16136785, lastStatsAtUs = 2109455647, runCount = 1279946, lateCount = 24816, execTime = 43},
{attribute = 0x200017fc <task_attributes+504>, dynamicPriority = 0, taskAgePeriods = 1, taskLatestDeltaTimeUs = 20251, lastExecutedAtUs = 2109465561, lastSignaledAtUs = 2109465561, lastDesiredAt = 356529496, movingAverageCycleTimeUs = 9.49466705, anticipatedExecutionTime = 384, movingSumDeltaTime10thUs = 1602307, movingSumExecutionTime10thUs = 49, maxExecutionTimeUs = 798, totalExecutionTimeUs = 1312614187, lastStatsAtUs = 4256932184, runCount = 869271643, lateCount = 6665563, execTime = 3},
{attribute = 0x20001814 <task_attributes+528>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 16680, lastExecutedAtUs = 2109459380, lastSignaledAtUs = 2109459380, lastDesiredAt = 2117997972, movingAverageCycleTimeUs = 16650.1699, anticipatedExecutionTime = 124, movingSumDeltaTime10thUs = 1330720, movingSumExecutionTime10thUs = 20, maxExecutionTimeUs = 37, totalExecutionTimeUs = 328003, lastStatsAtUs = 2109459380, runCount = 900210, lateCount = 2033, execTime = 0},
{attribute = 0x2000182c <task_attributes+552>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0},
{attribute = 0x20001844 <task_attributes+576>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 50035, lastExecutedAtUs = 2109431401, lastSignaledAtUs = 0, lastDesiredAt = 206448112, movingAverageCycleTimeUs = 50033.9219, anticipatedExecutionTime = 1012, movingSumDeltaTime10thUs = 4002590, movingSumExecutionTime10thUs = 221, maxExecutionTimeUs = 40, totalExecutionTimeUs = 742121, lastStatsAtUs = 2109431401, runCount = 261827, lateCount = 193, execTime = 7},
{attribute = 0x2000185c <task_attributes+600>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 200011, lastExecutedAtUs = 2109379051, lastSignaledAtUs = 0, lastDesiredAt = 4219465408, movingAverageCycleTimeUs = 200064.438, anticipatedExecutionTime = 4280, movingSumDeltaTime10thUs = 16001919, movingSumExecutionTime10thUs = 59, maxExecutionTimeUs = 153, totalExecutionTimeUs = 3583318, lastStatsAtUs = 2109379051, runCount = 64047, lateCount = 6793, execTime = 33},
{attribute = 0x20001874 <task_attributes+624>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 200008, lastExecutedAtUs = 2109453012, lastSignaledAtUs = 0, lastDesiredAt = 1960498112, movingAverageCycleTimeUs = 200009.891, anticipatedExecutionTime = 457, movingSumDeltaTime10thUs = 16000795, movingSumExecutionTime10thUs = 28, maxExecutionTimeUs = 35, totalExecutionTimeUs = 30221, lastStatsAtUs = 2109453012, runCount = 74227, lateCount = 95, execTime = 3},
{attribute = 0x2000188c <task_attributes+648>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0},
{attribute = 0x200018a4 <task_attributes+672>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 1000002, lastExecutedAtUs = 2108969736, lastSignaledAtUs = 0, lastDesiredAt = 2065098112, movingAverageCycleTimeUs = 1000035.44, anticipatedExecutionTime = 1345, movingSumDeltaTime10thUs = 80002827, movingSumExecutionTime10thUs = 345, maxExecutionTimeUs = 43, totalExecutionTimeUs = 90177, lastStatsAtUs = 2108969736, runCount = 14950, lateCount = 8, execTime = 10},
{attribute = 0x200018bc <task_attributes+696>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 50052, lastExecutedAtUs = 2109431529, lastSignaledAtUs = 0, lastDesiredAt = 113898112, movingAverageCycleTimeUs = 50037.2773, anticipatedExecutionTime = 1561, movingSumDeltaTime10thUs = 4003138, movingSumExecutionTime10thUs = 379, maxExecutionTimeUs = 40, totalExecutionTimeUs = 1274187, lastStatsAtUs = 2109431529, runCount = 259976, lateCount = 189, execTime = 12},
{attribute = 0x200018d4 <task_attributes+720>, dynamicPriority = 0, taskAgePeriods = 0, taskLatestDeltaTimeUs = 0, lastExecutedAtUs = 0, lastSignaledAtUs = 0, lastDesiredAt = 0, movingAverageCycleTimeUs = 0, anticipatedExecutionTime = 0, movingSumDeltaTime10thUs = 0, movingSumExecutionTime10thUs = 0, maxExecutionTimeUs = 0, totalExecutionTimeUs = 0, lastStatsAtUs = 0, runCount = 0, lateCount = 0, execTime = 0}}
	Default:0x24006e28 <tasks>
	Decimal:604007976
	Hex:0x24006e28
	Binary:100100000000000110111000101000
	Octal:04400067050

Screenshot 2022-07-19 230811
Screenshot 2022-07-19 230831

@SteveCEvans Do you have enough now to debug the scheduler?

@SteveCEvans
Copy link
Member

anticipatedExecutionTime looks good, the next state expecting to take 3us to execute. The scheduler would be happy with that. We need to know that value when it’s not, and how it got set to that value.

@hydra
Copy link
Contributor Author

hydra commented Jul 20, 2022

To me it feels like there's an overflow bug in the scheduler, look at the all the last-executed tasks when it last executed:

Int max: 4294967295 / 0xFFFFFFFF, OSD lastExecutedAtUs = 4256951772.

The value of 'anticipatedExecutionTime' below, and in #11333 (comment) are the values when it's stalled / not happy.

{attribute = 0x200017e4, ..., lastExecutedAtUs = 1191551792, lastSignaledAtUs = 0, lastDesiredAt = 3294995408, movingAverageCycleTimeUs = 10020.8789, anticipatedExecutionTime = 5753, movingSumDeltaTime10thUs = 801846, movingSumExecutionTime10thUs = 621, maxExecutionTimeUs = 103, totalExecutionTimeUs = 15363591, lastStatsAtUs = 1191551792, runCount = 1188493, lateCount = 24813, execTime = 44}, 
{attribute = 0x200017fc, ..., lastExecutedAtUs = 4256951772, lastSignaledAtUs = 4256951772, lastDesiredAt = 356512830, movingAverageCycleTimeUs = 9.49466705, anticipatedExecutionTime = 384, movingSumDeltaTime10thUs = 1602307, movingSumExecutionTime10thUs = 49, maxExecutionTimeUs = 798, totalExecutionTimeUs = 1312614187, lastStatsAtUs = 4256932184, runCount = 869271643, lateCount = 6665563, execTime = 4}, 
{attribute = 0x20001814, ..., lastExecutedAtUs = 1191555925, lastSignaledAtUs = 1191555925, lastDesiredAt = 1200151354, movingAverageCycleTimeUs = 16661.4707, anticipatedExecutionTime = 412, movingSumDeltaTime10thUs = 1333133, movingSumExecutionTime10thUs = 44, maxExecutionTimeUs = 37, totalExecutionTimeUs = 307376, lastStatsAtUs = 1191555925, runCount = 845137, lateCount = 1898, execTime = 3}, 

The values in #11333 (comment) are when the task finally resumed, i.e. scheduler ran the 'osdUpdateCheck' and the time comparison caused the state to change from IDLE to CHECK and then the scheduler eventually ran the osdUpdate method.

Your comment is a bit unclear, what info do you still need, and under what condition?

@hydra
Copy link
Contributor Author

hydra commented Aug 1, 2022

@SteveCEvans Hey, I know you're probably very busy but just wondering if you're able to make some time to look this is issue?

@hydra
Copy link
Contributor Author

hydra commented Nov 22, 2022

I rebased the H7RF branch on master (127aa52) today, monitoring the situation as it's been a few months since this ticket was updated.

Rebased branch is here: https://github.com/spracing/betaflight/commits/bf-h7rf-wip-28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BUG Bugs are excluded from automatically being marked as stale
Projects
None yet
Development

No branches or pull requests

3 participants