
Streaming adr #1157

Merged: 8 commits into cardano-scaling:master on Dec 14, 2023
Conversation

@cardenaso11 (Contributor) commented Nov 8, 2023

An ADR on improving the internal interfaces used in hydra-node to allow for better extension points to consume and produce StateChanged events.


  • CHANGELOG updated or not needed
  • Documentation updated or not needed
  • Haddocks updated or not needed
  • No new TODOs introduced or explained hereafter

@cardenaso11 force-pushed the streaming-adr branch 2 times, most recently from 69d3cfb to 39e7606 on November 8, 2023 10:40
@ch1bo (Collaborator) left a comment:

I like the context you give and most of the rationale for this aligns with ADR25.

I left some comments to prompt further detail in the decision section. It would be great if you could explain why some of the proposed changes are needed for your project, so we can distill that into the key decision points.


- Client must be flexible and ready to handle many different events

- There is no way to simply hand off transactions to the hydra-node currently; a full connection must be initiated before observed transactions can be applied, and bulky JSON objects can slow down "bulk" operations
Collaborator:

Note: there is a way to connect to the websocket without receiving the full history (at least) by using the history=0 query parameter.

- Many applications, like SundaeSwap's Gummiworm Protocol, do not need the full general API, and benefit from any performance improvement

- The challenge of subscribing to event types is complex to handle, but would be applicable to many parts of Hydra, not just for subscribing to new transactions. It's a good fit for passing stuff off to a message queue (MQTT, Kafka, whatever), probably from a dedicated process (see "Chain Server" below)
- Could also just directly use a "compatible" spec like STOMP or MQTT, but that would lock in implementation details
Collaborator:

I think ADR25 is a related draft proposal.

- This mock chain was used to mock the L1 chain, not the L2 ledger
- Attempt in February 2023 to externalize chainsync server as part of [#230][#230]
- Similar to [#119][#119], but with the "Chain Server" component in charge of translating Hydra Websocket messages into direct chain, mock chain, or any hypothetical message queue, not necessarily just ZeroMQ
- Deemed low priority due to ambiguous use-case at the time. SundaeSwap's Gummiworm Protocol would benefit from the additional control enabled by the Event Server
Collaborator:

Why are these two points relevant? The chain layer is mostly unrelated to the Client API.


A new abstraction, the EventServer, will be introduced. Each internal hydra event will be sent to the event server, which will be responsible for persisting that event and returning a durable monotonically increasing global event ID. Different implementations of the event server can handle this event differently. Initially, we will implement the following:
- MockEventServer, which increments a counter for the ID, and discards the event
- FileEventServer, which increments a counter for the ID, and encapsulates the existing file persistence mechanism
Collaborator:

Persisting the event stream should not be optional, as that would change the node's capability of retaining Head state (and it does not compose well: loading the state happens in a completely different part of the application than consuming these events).

Contributor:

I think it should definitely be the default, and perhaps required when running online, but in offline mode, when using it as a tool in a larger component, the state is persisted / managed externally. When using it as a component to do integration testing, saving the state would slow down the tests.

That being said, so long as the infrastructural piece is here, we can maintain that capability in our fork specifically for the Gummiworm protocol.

A new abstraction, the EventServer, will be introduced. Each internal hydra event will be sent to the event server, which will be responsible for persisting that event and returning a durable monotonically increasing global event ID. Different implementations of the event server can handle this event differently. Initially, we will implement the following:
- MockEventServer, which increments a counter for the ID, and discards the event
- FileEventServer, which increments a counter for the ID, and encapsulates the existing file persistence mechanism
- SocketEventServer, which increments a counter for the ID, and writes the event to an arbitrary unix socket
Collaborator:

How is this different to the websocket variant?

Contributor:

Namely, it could write to a unix socket in a binary format, rather than incurring the overhead of a web socket + tcp + serialization. We can also expose all internal events to the socket, which would be useful in ecosystem integrations, whereas the websocket currently exposes only a subset of events (from my understanding, I may be wrong here)

Collaborator:

The websocket currently contains all information, even a bit more. The shape is different and additional persistence is hacky. So it's not perfect, but we should be able to unify things here.

I don't think there is much of an overhead in using WebSockets over unix domain sockets. Plus there is additional complexity of message framing on domain sockets.

It sounds like what you want is rather a more compact encoding (binary instead of JSON)? I have suggested a way to select the encoding via a content-type in ADR25

Contributor:

OK; my understanding from talking with Arnaud was that some events are modelled internally but not exposed to the websocket, so we were trying to provide a deeper integration without breaking the websocket API. But yes, in that case we can unify the websocket and unix socket APIs for now.
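For concreteness, here is a minimal Haskell sketch of the EventServer handle discussed above. The names and shape are hypothetical, not an agreed interface; only the contract (accept an event, hand back a durable, monotonically increasing ID) comes from the quoted decision text:

import Data.IORef (atomicModifyIORef', newIORef)

newtype EventId = EventId Word
  deriving (Show, Eq, Ord)

-- The handle every variant (mock, file, socket, ...) would provide.
data EventServer e m = EventServer
  { sendEvent :: e -> m EventId
  }

-- The MockEventServer variant: bump a counter for the ID, discard the event.
mockEventServer :: IO (EventServer e IO)
mockEventServer = do
  counter <- newIORef (0 :: Word)
  pure EventServer
    { sendEvent = \_discarded ->
        EventId <$> atomicModifyIORef' counter (\n -> (n + 1, n + 1))
    }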

- Generalizes the UTxO persistence mechanism introduced in [offline mode][offline-mode]
- May be configured to only persist UTxO state periodically and/or on SIGTERM, allowing for performant in-memory Hydra usage. This is intended for offline mode use cases where the transactions ingested are already persisted elsewhere.
- One configuration which we expect to be common and useful is the use of a MultiplexingEventServer configured with a primary MockEventServer and a secondary UTxOStateServer, which will persist the UTxO state to disk.
- MultiplexingEventServer, which has a primary event server (which it writes to first, and uses its ID) and a list of secondary event servers (which it writes to in sequence, but discards the ID)
Collaborator:

Does this mean that instead of one, we might have multiple event servers configured for event consumption? Maybe this can be decided without going into too much details how a MultiplexingEventServer is structured (i.e. why compose with a mock one if you could rather say that "zero or more event servers can be registered")

Contributor:

The idea here was that we may want to multiplex out the results (save to disk and write to dynamodb), but we structured it this way for two reasons:

  • if we have this requirement of an ID, then the IDs that different servers provide may conflict, so we need to choose one unambiguously
  • some of these servers may be required for ACID properties (persist to durable storage), while for others "fire and forget" may be sufficient for informational purposes (like analytics), so the secondaries were in that spirit

That being said, if we don't have the id requirement, and we don't care about the ability to do fire and forget (or we can implement that via a wrapper), then I'd be happy to simplify this

Collaborator:

I think the event already has an ID and we can simplify things here (as commented above as well)
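As a sketch of the primary/secondary composition debated in this thread, reusing the hypothetical EventServer handle from the sketch above (again illustrative, not an agreed design):

import Control.Monad (void)

-- Write to the primary first and return its ID; then fan out to the
-- secondaries in sequence, discarding their IDs (the "fire and forget"
-- spirit mentioned above).
multiplexingEventServer ::
  Monad m =>
  EventServer e m ->
  [EventServer e m] ->
  EventServer e m
multiplexingEventServer primary secondaries = EventServer
  { sendEvent = \event -> do
      eventId <- sendEvent primary event
      mapM_ (\secondary -> void (sendEvent secondary event)) secondaries
      pure eventId
  }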

- UTxOStateServer, which increments a counter for the ID, and updates a file with the latest UTxO after the event
- Generalizes the UTxO persistence mechanism introduced in [offline mode][offline-mode]
- May be configured to only persist UTxO state periodically and/or on SIGTERM, allowing for performant in-memory Hydra usage. This is intended for offline mode use cases where the transactions ingested are already persisted elsewhere.
- One configuration which we expect to be common and useful is the use of a MultiplexingEventServer configured with a primary MockEventServer and a secondary UTxOStateServer, which will persist the UTxO state to disk.
Collaborator:

This appears to be an oddly specific decision to be taken. But I understand this is what you actually implemented in #1118 and aim to be using in gummiworm.

Contributor (author):

Yes, exactly. I'm going to try to rephrase it to make the use-case a little more coherent. It's intended to be used to easily maintain a UTxO that can later be used as a starting UTxO

Collaborator:

See my comment here about re-using UTxO in offline mode and why that actually should not be needed

- One configuration which we expect will be common and useful, is the usage of a MultiplexingEventServer configured with a primary MockEventServer, and a secondary UTxOStateServer, which will persist the UTxO state to disk.
- MultiplexingEventServer, which has a primary event server (which it writes to first, and uses its ID) and a list of secondary event servers (which it writes to in sequence, but discards the ID)

New configuration options will be introduced for choosing between and configuring these options for the event server. The default will be a multiplexing event server, using the file event server as its primary event server, and a websocket broadcast event server as its secondary event server.
Collaborator:

I think this is the main point of the decision block.. how will the behavior of the hydra-node change from the outside? How much configurability needs to be introduced and how can the default event consumers be turned off?

slug: 29
title: |
29. EventServer abstraction for event persistence
authors: []
Collaborator:

Suggested change
authors: []
authors: [@cardenaso11]

---

## Status
N/A
Collaborator:

Suggested change
N/A
Draft

@Quantumplation (Contributor):

@ch1bo @abailly after struggling to make the pieces fit for this ADR for a while, we significantly simplified it by separating the persistence changes from the websocket changes; we still plan to provide both, so expect a second ADR, but rather than trying to unify those mechanisms, we'll just make separate improvements to each.

Let me know if you want to discuss these on a call!

@ghost commented Dec 6, 2023

@Quantumplation Thanks a lot for this, I will have a look later on today and get back to you but I think it's all pretty clear.

@ch1bo self-requested a review on December 6, 2023 10:46
@ghost left a comment:

I have left some comments, I think we should clarify the intention behind the "at-least-once" semantics.


The Hydra node represents a significant engineering asset, providing layer 1 monitoring, peer-to-peer consensus, durable persistence, and an isomorphic Cardano ledger. Because of this, it is being eyed as a key building block not just in Hydra-based applications, but in other protocols as well.

One remaining difficulty in integrating Hydra into a fully productionalized software stack is the persistence model. Currently, the Hydra node achieves durability by writing a sequence of "StateChanged" events to disk. If a node is interrupted, upon restarting, it can replay just these events, ignoring their corresponding effects, to recover the same internal state it had when it was interrupted. However, downstream consumers don't have the same luxury.
Reviewer:

Would be good to complete this paragraph and fully motivate the change with one example of how this would be useful in production, if possible from your experience

Collaborator:

Yes, please. I do not understand the context / motivation for this.

The decision section does (to my understanding) imply that "persistence mechanisms" would get events re-submitted upon startup of the node. This, however, does sound more like we would want to have "event sinks" for the internal event stream (StateChanged + likely more stuff). That would make sense to me, especially as the Hydra.API.Server could be modeled as such an event sink.

Contributor:

@ch1bo That sounds like what we had before, but when we went to actually define how it would work, it was unclear how we would actually unify the two (right now they're treated very differently, and defining the exact recovery semantics, like in the example below, revealed just how different they are). After talking with @abailly-iohk, he suggested a much simpler step, focused just on extending the persistence mechanism.

Let me elaborate here first, and once we reach understanding, i'll codify it in the ADR.

Currently, we want to fork the hydra node for the gummi-worm protocol, to produce a stream of "archives" of the transactions we see confirmed. We have 3 ways to do this:

  • Build an application that connects to the websocket API, and bundle up and persist the transactions from the websocket messages we see
  • Build an application that tails the persistence log file on disk, and performs the relevant logic
  • Fork the hydra node, and replace the persistence mechanism with our own

You might imagine someone that wants to run this in a cloud native setting, where events are persisted to kinesis so they can be consumed from an AWS lambda, for example, would have a similar set of options.

However, approaches 1 and 2 have a number of downsides: they are a fully separate component to deploy, maintain, monitor, restart, etc.; 2 in particular is dependent on a file format that is likely to change (at least, more so than the websocket API).

Approach 3 would be difficult to maintain long term, because as the hydra node changes underneath it, it would be more and more work to merge the upstream changes into our fork.

The idea here is to genericize the persistence mechanism so that the fork can extend it in a fairly unobtrusive way.

It's a little frustrating, because we've been going back and forth on this design for a while now, and we keep seeming to reach agreement on what would be a good thing to build while we're on a call together, and then that has diverged by the time we get the ADR up for review.

Collaborator:

There are more aspects to your analysis of approaches. IMO approach 1 would be most favorable because it does not increase the scope to be maintained and allows for more loose coupling with a clear interface. For example, I do not see it favorable to add AWS dependencies to the hydra-node just for optional storing on S3 (or kafka deps, or ...). Not sure about a plugin mechanism either, but maybe an option.


I do understand your frustration. Maybe we should use less ambiguous modes of communication like drawing/amending an architecture diagram to clarify one or the other approach?

Or prototyping what you actually need for your solution and chat about code?

Anyhow, please take my comments as early feedback; we can also merge the ADR as is.. the actual implementation will result in discussions about details and further refinement anyhow.

Contributor:

For example, I do not see it favorable to add AWS dependencies to the hydra-node just for optional storing on S3 (or kafka deps, or ...)

Agreed, we do not plan to add this directly to the hydra node; the goal is to make it possible to add in a fork (or through a plugin mechanism), while at the same time reducing the overall burden of keeping that fork up to date.

As it is now, we'd have to make enough changes to the hydra code base that each time you touched the persistence mechanism at all, we'd have to go through a potentially painful merge process.

With this mechanism, we (and other users) would simply be implementing a standard interface in the fork, and the chance of merge conflicts is much smaller.


One remaining difficulty in integrating Hydra into a fully productionalized software stack is the persistence model. Currently, the Hydra node achieves durability by writing a sequence of "StateChanged" events to disk. If a node is interrupted, upon restarting, it can replay just these events, ignoring their corresponding effects, to recover the same internal state it had when it was interrupted. However, downstream consumers don't have the same luxury.

We propose generalizing the persistence mechanism to open the door to a plugin based approach to extending the Hydra node.
Reviewer:

This sentence seems more relevant to the Decision section. I would suggest replacing it with concrete examples.
eg.

For example, a hydra-node deployed as part of a larger system on k8s might want to store its
event stream using some persistence service afforded by the stack it is part of, instead of
plain, "naive", disk-based persistence. This would allow scenarios like graceful shutdown and
restart across redeployments, or redundancy of hydra-node-based services through a
primary-secondary architecture.


# Decision

We propose adding a "persistence combinator", which can combine one or more "persistenceIncremental" instances.
Reviewer:
Suggested change
We propose adding a "persistence combinator", which can combine one or more "persistenceIncremental" instances.
We propose adding a `CombinedPersistence` mechanism, which can combine one or more `PersistenceIncremental` instances.


When appending to the combinator, it will forward the append to each persistence mechanism.

As the node starts up, as part of recovering the node state, it will also ensure "at least once" semantics for each persistence mechanism. It will maintain a notion of a "primary" instance, from which events are loaded, and "secondary instances", which are given the opportunity to re-observe each event.
Reviewer:
This is not entirely clear to me what you mean here, perhaps a diagram (sequence?) would help understanding how would that work? There's both the append and the loadAll part to consider.

Contributor:

Imagine we have a setup that uses the CombinedPersistence mechanism to:

  • save the StateChanged events to the log file
  • build up a larger archive of transactions (say, 10,000 transactions at a time), and then save them to S3
  • publish a message for each StateChanged on kinesis

Now imagine we write the StateChanged event to disk, and crash when saving to S3. On startup, we will load all events from StateChanged and replay them against the hydra state, recovering our place. However:

  • the S3 destination wouldn't see the StateChanged event, and would lose progress on the most recent archive it was producing; downstream consumers would see a gap in the stream
  • The kinesis stream would also not see that last StateChanged event, and would also see a gap.

Perhaps, you might suggest, you solve this by reversing the order of these sinks: we save the event to each of the other sinks first, and to the persistence on disk last. Now, when we recover, the state change has never happened according to the node, but we've emitted it to S3 and kinesis, letting downstream actors react to state that the node then might not consider final.

So, if we build this combinator, it needs to feed the StateChanged events on startup not just to the hydra state (the outcomes of which get thrown away), but also to the other PersistenceIncremental instances. This gives "at least once" delivery semantics for each PersistenceIncremental instance, allowing each to recover its partial state and upgrade to "exactly once" delivery semantics if the underlying medium supports it (for example, by checking whether a file exists in S3, keeping track of its own delivery cursor, etc.).
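A minimal sketch of the combinator semantics just described, assuming a simplified PersistenceIncremental shape (the real handle in hydra-node also carries serialization constraints); combinePersistence is an illustrative name:

data PersistenceIncremental a m = PersistenceIncremental
  { append  :: a -> m ()
  , loadAll :: m [a]
  }

-- Appends fan out to every instance; loads come from the primary only,
-- but every loaded event is re-offered to the secondaries, which yields
-- the "at least once" delivery described above.
combinePersistence ::
  Monad m =>
  PersistenceIncremental a m ->   -- primary: loaded from on startup
  [PersistenceIncremental a m] -> -- secondaries: append-only
  PersistenceIncremental a m
combinePersistence primary secondaries = PersistenceIncremental
  { append = \x -> do
      append primary x
      mapM_ (`append` x) secondaries
  , loadAll = do
      events <- loadAll primary
      mapM_ (\e -> mapM_ (`append` e) secondaries) events
      pure events
  }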


Each persistence mechanism will be responsible for its own durability; for example, it may maintain its own checkpoint, and only re-broadcast the event if it is after the checkpoint.

The exact implementation details of the above are left unspecified, to allow some flexibility and experimentation on the exact mechanism to realize these goals.
Reviewer:
Fair enough :) But at least the ADR should be clear about the overall behaviour of the combination of persistence: is it "transactional"? Probably not, as we mentioned "at least once", but then some persistence backends might become unreliable, and we need to clarify what they should expect upon reloading.
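One way a backend could stay reliable under those semantics is the per-instance checkpoint mentioned in the quoted text. A hypothetical sketch, reusing the PersistenceIncremental shape from the sketch above and assuming events carry a monotonically increasing ID:

-- Skip events the wrapped instance has already durably recorded, turning
-- at-least-once redelivery into effectively-once processing for this sink.
withCheckpoint ::
  Monad m =>
  m Word ->          -- read this sink's persisted checkpoint
  (Word -> m ()) ->  -- persist a new checkpoint
  (e -> Word) ->     -- extract the event's monotonically increasing ID
  PersistenceIncremental e m ->
  PersistenceIncremental e m
withCheckpoint readCheckpoint writeCheckpoint eventId wrapped =
  wrapped
    { append = \e -> do
        checkpoint <- readCheckpoint
        if eventId e <= checkpoint
          then pure () -- already seen during a previous run
          else do
            append wrapped e
            writeCheckpoint (eventId e)
    }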


## Consequences

Here are the consequences we forsee from this change:
Reviewer:
Suggested change
Here are the consequences we forsee from this change:
Here are the consequences we foresee from this change:

- The default operation of the node remains unchanged
- Projects forking the hydra node have a natively supported mechanism to extend node persistence
- These extensions can preserve robust "at least once" semantics for each hydra event
- Sundae Labs will build a "Save transaction batches to S3" proof of concept extension
Reviewer:
👍


When appending to the combinator, it will forward the append to each persistence mechanism.

As the node starts up, as part of recovering the node state, it will also ensure "at least once" semantics for each persistence mechanism. It will maintain a notion of a "primary" instance, from which events are loaded, and "secondary instances", which are given the opportunity to re-observe each event.
Collaborator:

If "secondary instances" are not even used to load anything.. are these even "persistence mechanisms"?

I do understand this idea rather as an "event sink".

Practically this would mean that instead of piggy-backing the append interface, we would want the node to submit events to zero or more event sinks.

I assume the event in question is what is called StateChanged today + any information we currently model as Effect (at least ClientEffects). That is, any information which you could observe internally on the persisted state file + what is sent on the websocket today (ServerOutput).

Note that the term is heavily overloaded, but let's postulate such a data Event.

A useful design IMO would be to have an

data EventSink m = EventSink { process :: [Event] -> m () }

handle, where the node has a list of those:

data HydraNode m = HydraNode { ..., eventSinks :: [EventSink m] }

and stepHydraNode (its main loop) would process them in something like:

-- [...]
outcome <- logic
-- Not all hydra logic outcomes may yield an event
deriveEvents outcome $ \events ->
  forM_ eventSinks $ \EventSink{process} -> process events

That way, we could have multiple implementations for EventSink. I could see sending things off to S3, or through some IPC into a scrolls filter.

We could even model the current API data Server = Server { sendOutput :: ServerOutput -> m () } as such an event sink.
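For instance, a hypothetical adapter for that last point, reusing the EventSink and Server shapes from this comment; toServerOutput is an assumed projection, since not every internal event necessarily has a client-facing output:

-- Assumed projection from internal events to API outputs (illustration only).
toServerOutput :: Event -> Maybe ServerOutput
toServerOutput = undefined -- hypothetical

-- Wrap the existing API server as one more sink in the list.
serverAsEventSink :: Monad m => Server m -> EventSink m
serverAsEventSink server =
  EventSink { process = mapM_ (mapM_ (sendOutput server) . toServerOutput) }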

Collaborator:

OTOH.. not sure if we want to really implement those sinks in the hydra-node package at all? Maybe having one generic event stream exposed on some socket to which other processes could attach and consume event messages is more flexible?

(I do believe that websockets using binary encoded data is a good candidate for such a socket)

Contributor:

As I mentioned above, in a conversation with @abailly-iohk, we were talked out of this "EventSink" idea, for two reasons:

  • the "StateChanged" events and other outcomes were handled so differently, so unifying them seemed awkward
  • the semantics of how the replay after a crash works, in that case, get very confusing (see my comment with an example scenario above)

I'm happy to implement that if we're circling back to it, but I'm starting to feel like a ping-pong ball between you and Arnaud 😅

@ch1bo (Collaborator) commented Dec 12, 2023

Rewrote several parts based on our discussion today. Also, here is a half-way updated architecture diagram I scribbled together (red stuff is new):

[architecture diagram omitted]

@ch1bo requested review from Quantumplation and a user on December 13, 2023 10:42
@ch1bo removed their assignment on Dec 13, 2023
@Quantumplation (Contributor):

I like the revision; splitting "EventSource" and "EventSink" makes it pretty clear what the distinction is.

To me, the only open question is: what are the es that the EventSource and EventSink are producing/consuming? Is it the outcome type, or the StateChanged type, or some refactor of those types?

@ghost commented Dec 13, 2023

To me, the only open question is: what are the es that the EventSource and EventSink are producing/consuming? Is it the outcome type, or the StateChanged type, or some refactor of those types?

I think it can be left out of the ADR, and the implementation will be telling.

  • Implementing a first version with only the StateChanged will be straightforward, because there won't be any filtering involved should one want to restore a node from an event stream.
  • Then I suspect opportunities for factoring and simplifying the node's logic will surface, with all the Outcomes being part of the event stream and filtered according to usage. This should allow us, for example, to change how the Server receives its events.

@ghost left a comment:

Left some comments before I go on holidays, seems like we are getting there


* We realize that the `EventSource` and `EventSink` handles, as well as their aggregation in `HydraNode` are used as an API by forks of the `hydra-node` and try to minimize changes to it.

## Consequences
Reviewer:
An important consequence I would add, which is also in #1213, is that it becomes necessary to version the external representation of data we share with the outside world. Perhaps this is worth mentioning here?

Collaborator:

Yes, that should be the concern of a given EventSink and is especially valuable if we hope to have EventSink and EventSource be compatible across revisions of their implementations.


* The `stepHydraNode` main loop does call `putEvent` on all `eventSinks` in sequence.

- TBD: transaction semantics? what happens if one fails?
Reviewer:
Currently, we expect the append to succeed, and any exception thrown will kill the node. For the purpose of making the node restartable, this is probably a good enough guarantee?
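Under those semantics, the main-loop dispatch could be as simple as the following sketch. The EventSink/putEvent names follow the quoted ADR text; the shape is illustrative, not the final interface:

data EventSink e m = EventSink { putEvent :: e -> m () }

-- Call every sink in sequence; an exception from any sink propagates and
-- stops the node, so on restart the event gets re-submitted (at least once).
sendToSinks :: Monad m => [EventSink e m] -> e -> m ()
sendToSinks sinks event = mapM_ (\sink -> putEvent sink event) sinks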

* Sundae Labs can build a "Save transaction batches to S3" proof of concept `EventSink`.
* Sundae Labs can build a "Scrolls source" `EventSink`.
* Sundae Labs can build a "Amazon Kinesis" `EventSource` and `EventSink`.
* Extension points like `EventSource` and `EventSink` could be dynamically loaded as plugins without having to fork `hydra-node` (maybe in a future ADR).
Reviewer:
A possibility discussed on discord would be to make the sources and sinks configurable and "generic" through a URL-based mechanism, eg. you would start a node with:

hydra-node .... --data-source file:///some/file/path --data-sink tcp://some.host:1234 --data-sink https://some.service/

Then the code of the node knows how to talk to these external components but does not control them.

Collaborator:

I like the idea of using a URI to select & configure available source and sink implementations. I'll add a note that sources & sinks could be configurable (first step) or dynamically loaded (later step), but I think we agree that we should approach this incrementally and only build what we (also @Quantumplation) actually need.
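A hypothetical sketch of that URL-based selection, using the network-uri package; the schemes and constructors are illustrative, not an agreed interface:

import Network.URI (URI (..), URIAuth (..), parseURI)

data SinkConfig
  = FileSink FilePath      -- e.g. file:///some/file/path
  | TcpSink String String  -- host and port, e.g. tcp://some.host:1234
  deriving (Show)

-- Map a --data-sink argument to a configured sink, rejecting unknown schemes.
parseSinkConfig :: String -> Either String SinkConfig
parseSinkConfig raw = do
  uri <- maybe (Left ("not a URI: " <> raw)) Right (parseURI raw)
  case uriScheme uri of
    "file:" -> Right (FileSink (uriPath uri))
    "tcp:" -> do
      auth <- maybe (Left "tcp sink needs host:port") Right (uriAuthority uri)
      Right (TcpSink (uriRegName auth) (dropWhile (== ':') (uriPort auth)))
    other -> Left ("unsupported sink scheme: " <> other)

With this shape, --data-sink tcp://some.host:1234 would parse to TcpSink "some.host" "1234".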

@ch1bo (Collaborator) left a comment:

Specified that all loaded events on startup should be (re-)submitted to all configured eventSinks and incorporated other reviewer comments.

We likely need to update one or the other things still when implementing this, but this is a very good Draft state ADR now!

@Quantumplation (Contributor) left a comment:

Looks good to me! This is largely what I was envisioning a while back, and the split between "Source" and "Sink" really solves the challenges I was having. Thanks for taking a pass at revising this!

@ch1bo merged commit 5d1593a into cardano-scaling:master on Dec 14, 2023
17 of 18 checks passed
@Quantumplation deleted the streaming-adr branch on December 14, 2023 19:13
@ch1bo added this to the 0.15.0 milestone on Dec 22, 2023
@ch1bo mentioned this pull request on Feb 26, 2024