Improvements to slotting conversions / TimeInterpreter #1943
Left e -> do
traceWith tr $ MsgInterpreterPastHorizon (pretty query) e
throwIO e
liftIO $ traceWith tr $ MsgInterpreterPastHorizon (pretty query) e
MsgInterpreterPastHorizon shouldn't be critical anymore, as it might be expected.
I lowered it to Info and removed the log line about it being unexpected.
bors try
try: Build succeeded
There is so much unsafeRunExceptT here. It makes our code look unsafe.
This is replacing an actual exception type, PastHorizonException, with userError and a string message, which is worse.
Why not let all the time/slot conversion functions throw PastHorizonException wherever unsafeRunExceptT is currently used? In the other "safe" cases, catch the exception.
Alternatively, replace unsafeRunExceptT with a new util runNonFailingQuery (or some name like that). This will be basically runNonFailingQuery q = q >>= either throwIO pure. The advantage of this is that it's obvious where the exception could occur, and userError is not used.
And also we should sprinkle HasCallStack liberally - otherwise tracking down errors will be hard.
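A minimal sketch of what such a helper could look like, assuming the queries live in ExceptT over IO. PastHorizon, slotToSeconds and describeSlot are hypothetical stand-ins made up for illustration; they are not the wallet's actual types or functions.

```haskell
import Control.Exception (Exception, throwIO, try)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import GHC.Stack (HasCallStack)

-- Hypothetical stand-in for the real PastHorizonException.
newtype PastHorizon = PastHorizon String
    deriving (Show, Eq)

instance Exception PastHorizon

-- Run a query that is expected never to fail. If it does fail, the typed
-- error is rethrown as-is instead of being flattened into a userError.
runNonFailingQuery :: (HasCallStack, Exception e) => ExceptT e IO a -> IO a
runNonFailingQuery q = runExceptT q >>= either throwIO pure

-- Hypothetical conversion: only slots up to the horizon can be converted,
-- assuming a fixed slot length of 20 seconds.
slotToSeconds :: Int -> Int -> ExceptT PastHorizon IO Int
slotToSeconds horizon slot
    | slot > horizon =
        throwE (PastHorizon ("slot " <> show slot <> " is past the horizon"))
    | otherwise = pure (slot * 20)

-- A caller that expects the failure can still catch the typed exception.
describeSlot :: Int -> Int -> IO String
describeSlot horizon slot = do
    r <- try (runNonFailingQuery (slotToSeconds horizon slot))
    pure $ case r of
        Left (PastHorizon msg) -> "past horizon: " <> msg
        Right secs -> show secs <> "s"
```

Because the exception keeps its type, callers in the "safe" category can pattern-match on it, while every use of runNonFailingQuery marks a place where a failure would be a bug.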
@@ -1024,4 +1037,4 @@ instance HasSeverityAnnotation (NetworkLayerLog b) where
 MsgWatcherUpdate{} -> Debug
 MsgChainSyncCmd cmd -> getSeverityAnnotation cmd
 MsgInterpreter{} -> Debug
-MsgInterpreterPastHorizon{} -> Critical
+MsgInterpreterPastHorizon{} -> Info
Is this change really sound 🤔 ?
I think the reasoning is that we actually have some cases where the PastHorizonException is "normal" and somewhat expected.
But perhaps we can divide slot/time conversions into two categories:
- Where it's our bug if the conversion fails.
- Where any conversion errors are because of the user inputting a date which is beyond the safe zone. So an error, not a bug.
On launch, the TimeInterpreter may not have been fetched from the node. Instead of returning the singleEraInterpreter for the first era, it seems safer to block until it has been fetched. I imagine there could be race conditions where we would sometimes return completely wrong time data when in Shelley, just after starting the node. Not completely sure, but I hope there shouldn't be any drawbacks with blocking.
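The blocking behaviour described here can be sketched with an MVar that starts out empty. This is only an illustration under assumptions: Interpreter, currentInterpreter and demo are hypothetical names, and the real network layer may use a different synchronisation primitive.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

-- Hypothetical stand-in for the real TimeInterpreter.
newtype Interpreter = Interpreter { slotLengthSeconds :: Int }

-- Blocks until an interpreter has been stored. There is deliberately no
-- fallback to a default single-era interpreter.
currentInterpreter :: MVar Interpreter -> IO Interpreter
currentInterpreter = readMVar

-- Simulates a node that delivers the interpreter shortly after start-up.
demo :: IO Int
demo = do
    var <- newEmptyMVar
    _ <- forkIO $ do
        threadDelay 10000 -- pretend the node takes 10 ms to answer
        putMVar var (Interpreter 1)
    slotLengthSeconds <$> currentInterpreter var
```

Readers that arrive before the interpreter is available simply wait, so they can never observe time data computed from the wrong era.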
bors try
try: Build failed
If the node is not sufficiently in sync, we cannot know them.
bors try
But rather respond with NotResponding.
It fails without the recent fix
Instead, we do not push the exception down to every caller but rather throw it as an exception in the network layer. The rationale is that this exception can only occur when both conditions are met:
a) The node is still syncing and doesn't yet know about any hard fork.
b) A time beyond the node's foreseeable future is queried.
While syncing in Byron, these two conditions can't be met (times referenced in blocks are necessarily before the node's tip and can't be beyond its foreseeable future). There's the case of delegation certificates and/or transaction TTLs, but these only exist in Shelley, where the foreseeable future is, so far, unlimited. Yet, there are points in the API where a time that is far beyond the node's tip can be provided:
- As a filtering parameter when listing transactions.
- As the current time when looking at network parameters.
try: Build succeeded
Waited for the "trying" branch to pass. Then pushed a little update reshaping the git history and lowering the "PastHorizon" error from
Issue Number
#1869 / #1960
Overview
- singleEraInterpreter if we don't yet have a proper TimeInterpreter.
- TimeInterpreter IO with TimeInterpreter (ExceptT IO ErrPastHorizon)
- endTimeOfEpoch e instead of timeOf =<< firstSlotInEpoch (e+1). This is needed to stay inside the forecast range when showing next_epoch when the node is in-sync, in Byron. Fixes #1960 (GET http://localhost:8090/v2/network/information returns 500 on mainnet while wallet is syncing through Byron era).
- next_epoch and network_tip optional in the API. When the node is still syncing the Byron chain, it cannot know any slotting info close to the current time. Solves a problem related to #1960 (GET http://localhost:8090/v2/network/information returns 500 on mainnet while wallet is syncing through Byron era).
- TimeInterpreter