[rrd4j] Persistence warnings #15220
This issue has been mentioned on openHAB Community. There might be relevant details there: https://community.openhab.org/t/openhab-4-0-milestone-discussion/145133/539
I see this warning but the Items are not persisted every second. In my most recent instance I saw
The events in question from a full minute before and after the two changes to that Item.
I've added spaces to highlight the two events for the Item in question. The warning indicates that the second event is the one that was rejected. Things of note:
This is pretty consistent for me. In another case I have a warning for an Item that was changed almost exactly two seconds apart. In this case the events occurred at 23:00:05 and 23:00:07. It threw out the second one, but the reported timestamp for that event is 23:00:05 (this=1688965205) and the last update is 23:00:07 (last update=1688965207).

Looking at more of these, it seems like it's somehow swapping the timestamps of the previous event and the latter event and rejecting the data point because of that. Could there be a TOCTOU bug, or maybe the variables are swapped, or something like that? I've reviewed about half a dozen of these and they all seem to follow this same pattern: the update is rejected, but the reported timestamp in the rejection message is wrong, and allowing for rounding, the timestamps are swapped (i.e. the "this" timestamp belongs to the previous event and the last-update timestamp belongs to the current event).
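For what it's worth, here is a minimal sketch (hypothetical, not the actual rrd4j code) of one way such a "swapped" message can arise without any variables being swapped: if two queued samples are written out of order, the older one fails rrd4j's "must be newer than the last update" rule, and its warning naturally shows the earlier timestamp as "this" and the later one as "last update".

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OutOfOrderSketch {
    static long lastUpdate = 0;     // epoch seconds of the last stored sample
    static String lastWarning = null;

    // Mimics the rule that a sample must be strictly newer than the last update.
    static boolean store(long timestamp, double value) {
        if (timestamp <= lastUpdate) {
            lastWarning = "Bad sample time: this=" + timestamp
                        + " last update=" + lastUpdate;
            return false;           // the value is dropped entirely
        }
        lastUpdate = timestamp;
        return true;
    }

    public static void main(String[] args) {
        // Two events two seconds apart, but the scheduler happens to run
        // the later one first (e.g. everyMinute job vs. everyChange job).
        Deque<long[]> queue = new ArrayDeque<>();
        queue.add(new long[] {1688965207, 72});  // written first
        queue.add(new long[] {1688965205, 71});  // written second -> rejected
        for (long[] sample : queue) {
            store(sample[0], sample[1]);
        }
        System.out.println(lastWarning);
        // Prints: Bad sample time: this=1688965205 last update=1688965207
    }
}
```

The class and method names here are invented for illustration; only the rejection rule itself is taken from the observed warnings.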
Can you set
Done, and amazingly enough, an example happened immediately after changing the log level. :-D openhab.log
relevant events.log
There's almost three minutes between the two events this time. Unfortunately I didn't turn on the logging before 11:29, so I don't have the logs from when that change to 71.85 occurred to tell me why it took so long to save. I'll continue to monitor the logs and post when I have something more complete.

Based on observations, I'm wondering if the problem occurs when a change happens within a second of the everyMinute strategy save. That seems to be what happened in this particular case. I'll continue to monitor.

Edit: It happened again, and indeed it was again within a second of the everyMinute dump of Item states.
What persistence strategy do you have? Is it "everyUpdate, everyMinute"?
Yes.
I use MapDB for restoreOnStartup, so I created a .persist file. Based on the docs, everyMinute/everyChange matches the default saving strategies, which is why I used both. I wanted the default and to just disable restoreOnStartup.
I guess that might be the issue (and I have no good idea how to solve it). When "everyChange" is triggered at second 0 of a minute, the everyChange and everyMinute jobs may be scheduled with the same timestamp.
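The collision at second 0 can be illustrated with a tiny sketch (hypothetical values, assuming timestamps are truncated to epoch seconds before the database check): two jobs firing a few hundred milliseconds apart within the same wall-clock second collapse to the same second-resolution timestamp, so the second sample cannot pass a "must be newer" check.

```java
public class SecondCollision {
    public static void main(String[] args) {
        long everyMinuteMillis = 1688965200123L; // fires at xx:xx:00.123
        long everyChangeMillis = 1688965200650L; // fires at xx:xx:00.650

        long t1 = everyMinuteMillis / 1000;      // truncate to epoch seconds
        long t2 = everyChangeMillis / 1000;

        System.out.println(t1 == t2);            // true: identical timestamps
        System.out.println(t2 > t1);             // false: second sample rejected
    }
}
```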
Does rrd4j save the everyMinute as a single transaction? It kind of looks like that might be the case, and it makes sense that it would, but it's hard to tell. If so, would it make sense to adjust the timestamp on events that come in during that transaction so it's not before the most recently saved record? At least with my warning entries and watching the debug logs, it looks like new events are blocked from being saved while the everyMinute is running. By the time the everyMinute is done, the timestamp for the new event is in the past.

I'm sure it's worthy of a debate, but it seems less wrong to save the value with a timestamp that's off by a second or two than to throw the value away entirely, meaning it won't actually be saved to the database (assuming it doesn't change in the meantime) until one minute later.
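The "nudge the timestamp forward" idea above could look something like this (a hypothetical sketch, not the actual rrd4j/openHAB code; `adjusted` is an invented helper): instead of dropping a sample whose timestamp is not after the last stored one, bump it to one second past the last update, keeping the value with an error of at most a second or two.

```java
public class ClampSketch {
    static long lastUpdate = 1688965205;  // epoch seconds of last stored sample

    // Return a timestamp guaranteed to be strictly after the last update,
    // so the sample is accepted instead of being thrown away.
    static long adjusted(long timestamp) {
        return Math.max(timestamp, lastUpdate + 1);
    }

    public static void main(String[] args) {
        System.out.println(adjusted(1688965204)); // late sample -> 1688965206
        System.out.println(adjusted(1688965210)); // normal sample unchanged
    }
}
```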
Behind the scenes, there is a scheduled job in core that calls the persistence service for each item that has the time-based strategy defined. This happens very quickly (probably faster than the persistence service is able to store them, because storage happens sequentially), and if a large number of items is persisted, each of them is scheduled for storage as a single job. If an update occurs, it is also scheduled for storage, most likely after all the time-based jobs. That may explain the difference in the timestamps: opening 100 files, checking them, writing the new data, and closing them might take quite some time.
Based on the logs, I can confirm that it takes about 2.5 seconds on my system to complete the everyMinute. It seems to take between 2 and 20 msec per Item, and I have about 300 Items that get saved to rrd4j. There are some 100+ msec gaps here and there too.
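As a sanity check on those numbers, 300 items at 2 to 20 msec each brackets the observed ~2.5 seconds for the full everyMinute pass:

```java
import java.util.Locale;

public class TimingEstimate {
    public static void main(String[] args) {
        int items = 300;
        double low = items * 0.002;   // seconds at 2 ms/item
        double high = items * 0.020;  // seconds at 20 ms/item
        System.out.printf(Locale.ROOT, "%.1f s to %.1f s%n", low, high);
        // Prints: 0.6 s to 6.0 s
    }
}
```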
This has always been the case, but previously the event handler was blocked until a write completed. That is why there were fewer (or no) collisions.
One solution would be to use the timestamp when the data is really written to the file instead of the timestamp when the event occurred.
I personally would be happy with that. A few seconds this way or that is better than just short of a whole minute. |
Since the recent persistence optimization by @J-N-K, I'm seeing a lot of messages like the one below, and so are several others on the forum. Running S3515.
I think it happens when I update(+persist) an item multiple times in one second.
To be clear, it's okay that this will only persist one value per second.
But the remaining updates should not result in WARN level output so at least the log level should be lowered to DEBUG.