CFE_PSP_GetTime Loses Resolution in Conversion to Microseconds (GSFC DCR 14952) #76

skliper · 2019-09-25T23:16:07Z

There's a 'bug' in the implementation of the mcp750/vxworks PSP which causes the loss of timing resolution in the function CFE_PSP_GetTime(). for example:

{{{
void CFE_PSP_GetTime( OS_time_t *LocalTime)
{
uint32 DecCount;

/* Reads the time from the hardware register, then converts it
* into usable seconds and microseconds */

sysPciRead32(0xFC0011C0, (UINT32 )(&DecCount));
DecCount = DecCount & 0x7FFFFFFF;
DecCount = ((uint32) 0x0D6937E5) - DecCount;
LocalTime->seconds = DecCount / 8333311;
DecCount = DecCount % 8333311;
LocalTime->microsecs = (DecCount/8333) * 1000;
}/ end CFE_PSP_GetLocalTime */
}}}

In this case, (DecCount/8333) * 1000 is performed as an integer calculation (as DecCount and LocalTime->microsecs are integers), basically reducing resolution from microseconds (which this function can calculated from fractions of microseconds) to miliseconds.

NOTE: GSFC's rad750 version has a similar calculation ((DecCount / 2062) * 1000)

Should this be changed to something like (for mcp750):

{{{
LocalTime->microsecs = (uint32) (( ((double)DecCount) /8333.0) * 1000.0);
}}}

skliper · 2019-09-25T23:16:08Z

Imported from trac issue 72. Created by sstrege on 2016-08-08T16:39:59, last modified: 2019-08-14T14:13:14

skliper · 2019-10-02T14:47:14Z

@avan989 This would be a good one to look into if the issue still exists.

If it does I'd suggest:

LocalTime->microsecs = DecCount/8.333;

for simplest solution, and check if there's a test case to confirm this works as expected. Maybe even run on the target and confirm.

jphickey · 2019-10-28T01:55:41Z

I am not familiar with the MCP750 hardware timer register that is being read here, but I would strongly discourage using any floating-point arithmetic in the calculation. This will almost certainly result in reduced performance.

Note that suggested literal numbers like "8.333" or even "8333.0" or "1000.0" are of type double. This will cause all the integer values to be promoted to doubles, and value to be computed as a double, and then truncated back to an int in the final assignment.

The complication of the existing calculation is almost certainly done with the intent of keeping the calculation entirely within the integer domain. Integer ops are usually an order of magnitude faster.

skliper · 2019-10-28T13:56:00Z

Good performance point. Really the two lines of concern are:

PSP/fsw/mcp750-vxworks/src/cfe_psp_timer.c

Lines 102 to 103 in a82f7c9

    
           DecCount = DecCount % 8333311; 
        
           LocalTime->microsecs = (DecCount/8333) * 1000;

How about:

LocalTime->microsecs=(DecCount*1000000L)/8333311;

So it promotes to long and multiplies (never overflows because DecCount is always <8333311), does the full precision integer divide, then back into the uint32?

jphickey · 2019-10-28T15:53:58Z

But the intermediate value (DecCount*1000000L) may overflow....

It also looks like the original code has another problem where if the DecCount ends up being within the range of [8333000, 8333310], then the division will result in 1000, which will produce a microsecs value of 1000000, which is invalid, as it is supposed to be in range of [0,999999]. So I concur that this code definitely needs a fix.

My recommendation is to do the multiply before the divide, but limit your multiplier to avoid uint32 overrun. So long as DecCount is guaranteed to be <8333311, you should be able to multiply by up to 512 (or do a 9 bit shift) first. (8333311 * 512) < 2^32.

So either:

LocalTime->microsecs = (DecCount << 9) / 4267;

OR:

LocalTime->microsecs = (DecCount * 500) / 4167;

Neither are perfect because 83333311 doesn't have any common factors with 1000000. But both of these approaches will get you a much better microseconds approximation than the existing code does. The left shift might be slightly faster.

jphickey · 2019-10-28T16:26:17Z

Edit, just thought why not (3/25). This is 1/8.333333 and that is probably as close of an approximation as one can get with integer ratios, even better than my previous suggestions.

So, LocalTime->microsecs = (DecCount * 3) / 25;

Simple and probably the most accurate approximation of all.

skliper · 2019-10-28T18:25:58Z

But the intermediate value (DecCount*1000000L) may overflow....

Whoops, I meant LL (or ULL). Linux/gcc seems to promote it to 64 bit either way (I tested before I commented), but I haven't had a chance to figure out if that behavior is in the standard... either way probably more obvious as ULL.

So LocalTime->microsecs = (DecCount * 1000000ULL) / 8333311;

I'd prefer the exact conversion, vs 3/25 approximation, since they do give different answers. Biggest error is at max value to rollover your approximation would skip from 999997 to 0, where as ULL solution smoothly goes from 999999 to 0.

jphickey · 2019-10-28T18:50:04Z

Although C99 requires long long of [at least] 64 bits, the 64-bit ops are not necessarily implemented in hardware. On a 32 bit processor the 64 bit operation it is likely to be emulated in software. At that point the floating-point solution might be more performant.

All this is just speculation though - you might want to write a test program and benchmark all three. But I do think if you do, you'll find that the 3/25 ratio is the best compromise of speed and accuracy.

Note that in my experience control loops do often call some form of polling of "gettime" from high priority threads. So its worth making it as efficient/lightweight as possible.

skliper · 2019-10-28T20:13:08Z

Lacking context, how do you suggest we evaluate accuracy vs speed?

jphickey · 2019-10-28T20:14:04Z

Just had one more thought on this. How about:

LocalTime->microsecs = ((300 * DecCount) + (DecCount / 1244)) / 2500;

This is an effective conversion ratio of ((300 + (1/1244) / 2500), or approx 1/8.33331100399, so its almost exact, with probably less performance penalty than either of the other options, as it f just an extra integer division. It should roll over smoothly with no "skipped" values.

jphickey · 2019-10-28T20:16:28Z

You could test for speed by simply writing a small C program that implements all three of the conversion styles (32 bit int, 64 bit int, double floating point) and just run it in a tight loop for a few seconds and count iterations. You need to run it on the actual target hardware, of course.

skliper · 2019-10-28T21:01:56Z

I wasn't questioning how to test the speed of the solutions, but how do you suggest trading accuracy vs speed (3/25 is faster by x, but at the "cost" of up to 2 additional usec of error).

Note (300*DecCount) overflows on Linux w/ gcc withough declaring 300 as unsigned. So:

LocalTime->microsecs = ((300u * DecCount) + (DecCount / 1244)) / 2500;

Gives the exact same results as the ULL solution over the full range, on Linux. Definitely worth a speed check on all three. Since it's fun to guess, I bet this solution takes ~2x as long as 3/25, and is within 10% of ULL.

jphickey · 2019-10-29T00:20:04Z

I think for the tradeoff, it is just a subjective value decision at this point whether 2-3us of possible error is worth the extra CPU cycles that a more arithmetically-correct implementation produces. Obviously existing code was running OK with up to 1000us (or more!) of error in the existing implementation. To put it in context, most things in CFE apps don't get scheduled at a frequency higher than 100Hz or so anyway, usually more like 1Hz or less.

It does speak to the need for PSP requirements though. Otherwise one cannot say whether CFE_PSP_GetTime is behaving within requirements or not here, or be able to quantify if a slight loss of precision for a faster implementation is "worth it" or not. It is a judgement call for the platform's users to make.

This is also an area that IMO could use clean up because the MPC750 PSP passes a somewhat hw-specific timer up the stack that "rolls over" at weird intervals and there is a lot of corresponding complexity in both CFE_TIME and the PSP API to deal with that rollover, which is really specific to this PSP. But that is a separate issue, a discussion to be had when reviewing PSP requirements. I would definitely advocate for simplification here.

Pedantic note: you really shouldn't need the "u" suffix on literals in the code, because the DecCount is already a uint32, not an int32. So the literal will be promoted to unsigned to match before the multiplication occurs, and the result will be unsigned. If you tried this, my guess is your test variable in the test implementation might be signed? Usually I do not like to add the suffixes, they can have unintended consequences in cross platform environments where the C-standard type promotion rules usually work fine.

skliper · 2019-10-29T13:16:23Z

it is just a subjective value decision at this point

Definitely agree, which is why I lean towards just picking a solution, and documenting the other options in a comment. I'd prefer to avoid implying there is one right answer for all use cases, rather pick one present the trade and move on rather than spend too much time trying to pick the "right" one since it depends on context.

Obviously existing code was running OK with up to 1000us (or more!) of error in the existing implementation.

User submitted a bug report and a fix. I'd bet they "cloned and owned" the fix for their use rather than living with 1000us of error. I don't think your assumption here that 1000us of error was "OK".

It does speak to the need for PSP requirements though. Otherwise one cannot say whether CFE_PSP_GetTime is behaving within requirements or not here, or be able to quantify if a slight loss of precision for a faster implementation is "worth it" or not. It is a judgement call for the platform's users to make.

I suggest we take an approach similar to OSAL, where the API and behavior gets documented. Basically this is what the code is expected to do and why, but allow projects the freedom to make the judgement calls (avoid expressing "requirements" that aren't really "required" without additional project context, like a timer shall be accurate to Xus or similar). The PSP's we provide as part of the open source cFS Framework are example implementations, so I'd prefer to scope our effort to describing the choices made vs trying to generate requirements that work for all use cases.

This is also an area that IMO could use clean up...

Agreed, but I'd consider it pretty low priority (at least from GSFC point of view) since no future projects have targeted this PSP for use. I certainly encourage a ticket to document clean-up opportunities/suggestions, but I'd consider it a low priority enhancement (unless funded I likely won't focus resources on it).

Pedantic note: you really shouldn't need the "u" suffix on literals in the code, because the DecCount is already a uint32, not an int32.

Agreed! I found my bug. I was defining DecCount as uint32_t, but then I used an iterator to check accuracy and defined it as int. Good catch.

jphickey · 2019-10-29T13:44:51Z

This is also an area that IMO could use clean up...

Agreed, but I'd consider it pretty low priority (at least from GSFC point of view) since no future projects have targeted this PSP for use. I certainly encourage a ticket to document clean-up opportunities/suggestions, but I'd consider it a low priority enhancement (unless funded I likely won't focus resources on it).

The only reason I bring it up is because the issue is not confined to this particular PSP. If only this PSP had to deal with it, then things would be fine. But its an area where the abstraction is lacking, which in turn increases complexity in code on top of it (specifically CFE_TIME which uses this API) which IMO is the real issue.

The CFE_TIME has all sorts of complexity to handle the frequent wraparound because the PSP isn't actually handling it, its just passing basically a "raw" value up the stack. There are other related PSP calls - see CFE_PSP_Get_Dec, CFE_PSP_Get_Timebase, CFE_PSP_GetTimerLow32Rollover, etc. that are related to the insufficient abstraction, but I couldn't tell you what they are for or how one is supposed to implement them on other platforms.

The CFE_TIME has way too much complexity and there are a number of bugs against this. Fixing this should entail simplifying and bringing some sanity/consistency to the PSP time access functions, which would enable much of the complex wraparound logic to be removed from CFE_TIME, which really exists there only for this platform - other platforms don't have this wraparound issue.

Sorry if this is a diversion from the original ticket, this is a longer term fix but it should be on the to-do list somewhere because it will greatly improve CFE_TIME and other apps that need timing info.

skliper · 2019-10-29T13:54:51Z

... it should be on the to-do list somewhere because it will greatly improve CFE_TIME and other apps that need timing info.

Definitely agree. The following tickets are in at the cFE level under @acudmore review, addressing these and your comments above from an architectural design level is definitely on the to-do list. A review of abstraction across the entire cFS for time makes sense to me. That said, I don't have identified resources to focus on it this FY.

nasa/cFE#123
nasa/cFE#109
nasa/cFE#78

skliper · 2019-10-29T17:07:32Z

I did push a branch, but didn't do any associated testing. Really not all that useful without doing all the tests, but at least an example of what could be done.

Fix #76, Microsec conversion lost resolution

skliper self-assigned this Sep 25, 2019

skliper added the bug Something isn't working label Sep 25, 2019

skliper removed their assignment Sep 26, 2019

CDKnightNASA mentioned this issue Jul 22, 2019

switch PSP to use OSAL timers rather than itimers #1

Closed

skliper added a commit that referenced this issue Oct 29, 2019

Fix #76: Microsec conversion lost resolution

0f10c67

skliper mentioned this issue Apr 2, 2020

Fix #76, Microsec conversion lost resolution #147

Merged

astrogeco added a commit that referenced this issue Apr 21, 2020

Merge pull request #147 from skliper/is76-usec-conversion

c8bde97

Fix #76, Microsec conversion lost resolution

astrogeco mentioned this issue Apr 27, 2020

Integration Candidate 2020-04-15 #155

Merged

astrogeco closed this as completed in 6ddf6fd Apr 27, 2020

skliper added this to the 1.5.0 milestone Jun 29, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CFE_PSP_GetTime Loses Resolution in Conversion to Microseconds (GSFC DCR 14952) #76

CFE_PSP_GetTime Loses Resolution in Conversion to Microseconds (GSFC DCR 14952) #76

skliper commented Sep 25, 2019

skliper commented Sep 25, 2019

skliper commented Oct 2, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 28, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 28, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 29, 2019

skliper commented Oct 29, 2019

jphickey commented Oct 29, 2019

skliper commented Oct 29, 2019

skliper commented Oct 29, 2019

CFE_PSP_GetTime Loses Resolution in Conversion to Microseconds (GSFC DCR 14952) #76

CFE_PSP_GetTime Loses Resolution in Conversion to Microseconds (GSFC DCR 14952) #76

Comments

skliper commented Sep 25, 2019

skliper commented Sep 25, 2019

skliper commented Oct 2, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 28, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 28, 2019

jphickey commented Oct 28, 2019

skliper commented Oct 28, 2019

jphickey commented Oct 29, 2019

skliper commented Oct 29, 2019

jphickey commented Oct 29, 2019

skliper commented Oct 29, 2019

skliper commented Oct 29, 2019