Clarification of handling FMUs with constant step size and final step when rounding errors occur #575
Comments
(reply from: Jean-Philippe Tavella) Dear Andreas, In my experience, it is possible to violate the EndTime condition, especially with FMUs exported from Dymola, even when the master calls fmi2SetupExperiment() with the parameter stopTimeDefined set to fmi2True. I don't think there is a clause anywhere in the standard that prohibits a master from calling doStep() outside the StartTime/StopTime interval. Jean-Philippe Tavella |
(reply from: Martin Sjölund)
time = startTime + i * stepSize
Or:
time = time + stepSize
I usually calculate it as the former to avoid accumulating numerical errors. The following Python code shows why:
>>> i = 0
>>> for j in range(0,100): i = i + 0.01
...
>>> i
1.0000000000000007
>>> 0 + 100.0 * 0.01
1.0 |
@martin: in my code I use the variant to add stepsize to time (same code for constant and variably spaced intervals). In any case, the solution of using So, I guess FMUs must indeed handle these situations nicely without complaining in the last step. |
Shouldn't the master make sure, before starting the computation, that stepSize and StopTime are set for all FMUs such that none of the above-mentioned constraints is violated? |
Regular Design Webmeeting: Karl: In the FMI 2 Standard it is defined that the FMU has to synchronize its time to the time defined by the master. Klaus: I will check the FMI 2 text to see if we need to clarify something |
Not sure if this is related, but the FMU
from the test suite appears to have a problem related to step sizes. At least, the following output is generated by the FMU:
with simulation time limits in the opt file:
With StepSize = 0.001 the error occurs, with 0.0001 the simulation runs fine to the end. Strange... (seems like an internal error in the FMU, maybe related to a check like t+dt > t_end --> error). -Andreas |
My suggestion is that the FMU implementation checks for the stopTime according to some tolerance (maybe the tolerance given at the setupExperiment, normalized to the time scale). |
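A minimal sketch of such a tolerance-based check, written in Python for brevity. The function name `exceeds_stop_time`, the default `rel_tol`, and the way the slack is scaled are assumptions for illustration; none of them come from the FMI standard.

```python
def exceeds_stop_time(time, stop_time, rel_tol=1e-9):
    """Return True only if `time` lies beyond `stop_time` by more than rounding noise."""
    # Assumption: "normalized to the time scale" is read here as scaling the
    # allowed overshoot with the magnitude of the times involved.
    slack = rel_tol * max(abs(time), abs(stop_time), 1.0)
    return time > stop_time + slack


# End time obtained by adding 0.01 one hundred times (value from the thread above).
print(exceeds_stop_time(1.0000000000000007, 1.0))  # False
# A genuine overshoot of one full step is still reported.
print(exceeds_stop_time(1.01, 1.0))  # True
```

Whether an FMU should apply such a tolerance at all is exactly what the rest of this thread debates.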
Would be one option - or simply using stopTime for evaluating time-dependent quantities (time series data etc.) for any t > stopTime. We should, however, make this clear in the standard. @jean-philippe: as you said, there is no clause that prevents the Master from exceeding the end time (a little) in a doStep() call. Couldn't we then just add this to the compliance checker and let it purposefully overstep the end time a little, just to check if the FMU handles this gracefully? |
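The clamping idea can be sketched as follows. This is an illustration only: `sample_time_series` and its linear interpolation are hypothetical and do not describe how any particular FMU evaluates its data.

```python
from bisect import bisect_right


def sample_time_series(times, values, t, stop_time):
    """Linearly interpolate `values` over `times`, treating any t > stop_time as stop_time."""
    t_eval = min(t, stop_time)  # clamp a slight overshoot back to stopTime
    if not times[0] <= t_eval <= times[-1]:
        raise ValueError("time out of range")
    k = bisect_right(times, t_eval) - 1  # index of the last sample at or before t_eval
    if k >= len(times) - 1:
        return values[-1]
    t0, t1 = times[k], times[k + 1]
    w = (t_eval - t0) / (t1 - t0)
    return values[k] + w * (values[k + 1] - values[k])


times = [0.0, 0.5, 1.0]
values = [0.0, 5.0, 10.0]
# The data ends exactly at 1.0; a request slightly past it no longer fails.
print(sample_time_series(times, values, 1.0000000000000007, stop_time=1.0))  # 10.0
```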
I think there is.
|
IMHO: that's a bad idea (see the discussion of rounding errors above). Example: say the built-in stop time is some value x, and when writing the modelDescription file the number gets rounded (due to limited number precision) to be slightly larger than x. Now, the Master has no other information than to run to this slightly larger value, and hence runs past the built-in stop time = booom! So, I'm very much in favor of dropping such a clause from the upcoming standard and instead requiring FMUs to work well (even though some of the tool vendors need to do some work on their end to create well-behaving FMUs). |
Which seems like a mistake, since it suddenly requires error checking by the FMU, i.e. the callee, which we usually don't require. And for an FMU that does not care about the stopTime (i.e. most FMUs) this means it must actively do something with the stopTime, which is again stupid. I think this sentence must go, and be replaced by a sentence that states that it is an error if the environment exceeds stopTime. I also think it should be the burden of the environment to ensure that the values of currentCommunicationPoint+stepSize do not exceed the stopTime: It is in full control of both, so it can ensure that the stopTime is reached exactly (by choosing e.g. an appropriate stopTime, or just not setting one in the first place). I don't see the need for the FMU to do any kind of fudging... And we already make people aware of the fact that there can be numerical deviations between the internal times of the FMU and what is passed in as currentCommunicationPoint... |
Then the FMU generator is broken and should be fixed. I’m sorry, but in 2020 there is no reason to have faulty floating-point printer/reader implementations. Both IEEE754 and correct algorithms have been available for >30 years... |
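To illustrate the difference between a lossy and a round-trip-safe number writer, here is a small Python sketch. The value `315360987.0` and the helper names are made up for the example; the six-digit format merely stands in for an exporter that writes a shortened stop time.

```python
def write_short(x):
    """Format with 6 significant digits, as a lossy exporter might."""
    return format(x, ".6g")


def write_exact(x):
    """Format with the shortest string that reads back to the same double."""
    return repr(x)


stop_time = 315360987.0  # hypothetical built-in stop time in seconds

short = write_short(stop_time)
print(short, float(short) > stop_time)  # 3.15361e+08 True

exact = write_exact(stop_time)
print(exact, float(exact) == stop_time)  # 315360987.0 True
```

With the lossy writer, the value a master reads back is larger than the built-in stop time, which is the failure scenario described above; the round-trip writer avoids it.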
I agree with this as well. However, in the same paragraph, there's the following:
This text would have to be changed as well, perhaps to non-normative text. Furthermore, I would expect that if the FMU supports fixed step size, both master and FMU will accumulate the time in a way that prevents the StopTime from being exceeded. This means that the FMU has to admit the possibility of numerical errors, and use the expressions when checking if the current time has exceeded the stopTime. For instance
where @t-sommer I see that you wrote this part, are there actual usages of this feature? |
I don't think @t-sommer wrote this, this is straight from the 2.0 standard text, hence probably the fault of someone else ;). I still don't see why the FMU should be doing anything special for stopTime: The master has complete control over its choice of stopTime and its time calculations, and hence can ensure it does not exceed stopTime. Why should the FMU care? And the FMU already has to deal with discrepancies of the master time calculation, vs. its internal one, which is why the master communicates its idea of the current time in each doStep call (and there is text in the standard that addresses this). What am I missing here? |
Pierre, you are of course right, but I'm having engineering users and engineering tool developers in mind, as quite a relevant target group for the FMI standard. From my experience, on average we cannot expect the same level of expertise that some of the FMI members have, so my preference is a robust and as much as possible human-error-tolerant standard. An example for stopTime problems: in building energy simulation, duration of simulation cases easily exceeds several years, expressed in seconds this gives large numbers. Most FMI generators (including modelica-based tools) write stop time in seconds using general/scientific notation, with 5/6 digits precision. So much for the IEEE754 specs applied in the field :-) Maybe within automotive use cases this is not an issue (because durations are rather short) - with building energy and HVAC simulation models, well, we got problems... From my point of view, the most robust solution would be:
This is IMHO a robust solution which helps to avoid any critical error aborts at the end of the simulation. These could be nasty whenever such an abort causes data from the simulation run to be lost, i.e. when master or FMUs only write their results after a successful finish of the simulation. Generally, the last integration interval should not be overrated, since from my experience, the interesting stuff of the simulation model should really happen somewhere in between start and stop time. -Andreas |
I think we can expect tool developers, i.e. developers of tools implementing numeric algorithms to have a basic understanding of IEEE754. We expect them to implement or at least use solvers responsibly, which requires far more understanding of numeric algorithms and their pitfalls than just using a suitably correct printer/reader library. Or in other words, how would a developer who does not care about reading and printing accuracies then suddenly implement useful "fault-tolerant" behavior for unknown to him rounding issues of other implementations? The worst that one sees in numeric algorithms is random epsilons being applied to gloss over fundamental problems, so that stuff suddenly "works".
Then we (the users) should file bugs against those implementations, if they do not meet our expectations. And if an FMU exporter outputs a default stop time of 3.185136e9, but does not accept stepping to the IEEE754 double precision floating-point number that this corresponds to under IEEE754 conversion rules (using round-to-even), then again this is a bug that needs fixing. What slightly confuses me is that you seem to be treating the stopTime in the modelDescription.xml element as anything other than a useful default value: Nothing at all prevents the master from choosing a stopTime of its own design that it is sure it can hit, using whatever algorithm it uses to calculate its communication points. And if it ensures that this is less than or equal to the default time, it would seem probable that the FMU will support this.
How is this robust? Robust against what? How does a co-simulation master rely on this behavior, and why should it?
Why not fix the stopTime instead? This surely seems to be the better behavior? And why should they "try" to stick to it, when they very easily can? I.e. it seems to me I'm missing something here.
It seems to me that as a quality of implementation issue I'd file a bug against all implementations that intentionally throw away data just because an error occurred. But I think this is unrelated to this issue.
Then why insist on having a stopTime that does not match your communication points? Again FMUs and masters must solve the much hairier issue of communication points being calculated in different ways resulting in potentially different times (hence the master gives its idea of the current time redundantly in doStep), while the stopTime is fully under control of the master, so I don't see any new problems being created by this that cannot be solved by the master itself. |
Besides hoping that someone involved remembers the rationale at the time for this and chimes in on here? Probably not. You might want to look through the meeting minutes in the years prior to the 2.0 release, but whether they will contain something edifying? That said, I don't think there is much mystery as to a suitable rationale: The stopTime argument to the fmi2SetupExperiment call is an optional mechanism, that allows the FMU to a) perform various optimizations and b) check beforehand whether the model is valid for the planned simulation time. IF a master chooses to set a stopTime, it can be expected to then adhere to its self-chosen stopTime. Stepping beyond that time is considered an error, and in 2.0 the FMU is required to report it as such. The master has full control over it, so why should it bungle this? Either set no stopTime, or set a stopTime that you can hit or stay below. If for unfathomable reasons you need to fudge here, just set the stopTime to a greater value that you will not exceed. I don't agree with the mandatory checking by the FMU, since we generally don't require that kind of mandatory checking from callees. And since most FMUs will not care about the stopTime at all, making them check seems excessive. But other than that I don't see a specific problem with this mechanism. |
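One way a master could pick a stopTime that it can hit or stay below is sketched here in Python. The function `achievable_stop_time` is hypothetical and assumes the master computes its communication points as start_time + i * step_size.

```python
import math


def achievable_stop_time(start_time, step_size, requested_stop_time):
    """Return (n_steps, stop_time) with stop_time = start_time + n_steps * step_size
    being the largest such value that does not exceed requested_stop_time."""
    n = math.floor((requested_stop_time - start_time) / step_size)
    # The division itself is rounded, so correct the estimate in both directions.
    while start_time + (n + 1) * step_size <= requested_stop_time:
        n += 1
    while n > 0 and start_time + n * step_size > requested_stop_time:
        n -= 1
    return n, start_time + n * step_size


print(achievable_stop_time(0.0, 0.01, 1.0))  # (100, 1.0)

# Here 3 * 0.1 evaluates to slightly more than 0.3, so the master stays below.
n_steps, stop_time = achievable_stop_time(0.0, 0.1, 0.3)
print(n_steps, stop_time <= 0.3)  # 2 True
```

The returned stop_time is what the master would pass to the FMU, so the last communication point coincides with it by construction.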
Agreed. I'll think about this and will put up a proposal on how to add such a check to the fmi compliance checker. Though, we do not yet have a mechanism to forcefully cause an FMU to abort and require this abort as "correct behavior". Since there is currently no unique return code/info mechanism formulated, to identify the exact reason for failure, this may be tricky... (but unrelated to this topic).
@pmai Indeed, it looks like I've missed something here. What I was aiming at was the following scenario:
If I understand you right, this would be considered a fault on the Master's side, correct? There are two options to fix that in the master:
Are we on the same page, now? -Andreas |
Yes, I would consider this to be a fault of the master algorithm.
Four things I'd add to that:
|
We have introduced a few things since this ticket started to address this point:
We could add a sentence about how not to add up h with multiple additions and use n*h instead to avoid numeric dirt. Or add a link to Goldberg: https://dl.acm.org/doi/abs/10.1145/103162.103163. Closing this ticket, please reopen and add PR if you think there is something missing in the standard document. |
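A master loop following the n*h suggestion might look like the Python sketch below. `run_fixed_step` and the `do_step` callback are placeholders for illustration, not FMI API calls.

```python
def run_fixed_step(do_step, start_time, step_size, n_steps):
    """Call do_step(current_communication_point, step_size) n_steps times and
    return the end time. Each communication point is computed from the step
    index, so rounding errors do not accumulate from step to step."""
    for i in range(n_steps):
        current_communication_point = start_time + i * step_size
        do_step(current_communication_point, step_size)
    return start_time + n_steps * step_size


calls = []  # records the (time, step size) pairs a real master would pass to the FMU
end_time = run_fixed_step(lambda t, h: calls.append((t, h)), 0.0, 0.01, 100)
print(len(calls), end_time)  # 100 1.0
```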
I have a question about the correct handling of CoSim-FMUs that cannot use variable communication step sizes.
Example: StepSize = 0.01 s, StartTime = 0, StopTime = 1
Due to rounding errors, the last communication interval would run from 0.990000001 to 1.000000001. This, however, gives an exception in an FMU about "time out of range" (obviously when evaluating a parameter time series at t=1.00000001 where the data ends exactly at 1).
What would be the correct/expected behavior (in the sense of cross-checking rules) for a co-simulation master?
Thanks for clarification on the matter,
Andreas
PS: is there anywhere a clause that prohibits a master to call doStep() outside the StartTime/StopTime interval?